How to detect string tone from FFT - audio

I've got a spectrum from a Fourier transformation. It looks like this:
A police car was just passing nearby
Color represents intensity.
The X axis is time.
The Y axis is frequency, where 0 is at the top.
While whistling or a police siren leaves only one trace, many other tones seem to contain a lot of harmonic frequencies.
Electric guitar plugged directly into the microphone (standard tuning)
The really bad thing is that, as you can see, there is no single dominant intensity: there are 2-3 frequencies that are almost equal.
I have written a peak detection algorithm to highlight the most significant peak:
function findPeaks(data, look_range, minimal_val) {
    if(look_range == null)
        look_range = 10;
    if(minimal_val == null)
        minimal_val = 20;
    //Array of peaks
    var peaks = [];
    //Currently the max value (that might or might not end up in the peaks array)
    var max_value = 0;
    var max_value_pos = 0;
    //How many values did we check without changing the max value
    var smaller_values = 0;
    //Tmp variable for performance
    var val;
    //averageValues is a custom helper on the data array (see the fiddle)
    var lastval = Math.round(data.averageValues(0, 4));
    //console.log(lastval);
    for(var i = 0, l = data.length; i < l; i++) {
        //Remember the value for performance and readability
        val = data[i];
        //If the last max value is larger than the current one, proceed and remember
        if(max_value > val) {
            //Count the values that are smaller than our champion
            smaller_values++;
            //If there have been enough smaller values, we take this one as a confirmed peak
            if(smaller_values > look_range) {
                //Remember peak
                peaks.push(max_value_pos);
                //Reset other variables
                max_value = 0;
                max_value_pos = 0;
                smaller_values = 0;
            }
        }
        //Only take values when the difference is positive (next value is larger)
        //Also only take values that are larger than the minimum threshold
        else if(val > lastval && val > minimal_val) {
            //Remember this as our new champion
            max_value = val;
            max_value_pos = i;
            smaller_values = 0;
            //console.log("Max value: ", max_value);
        }
        //Remember this value for the next iteration
        lastval = val;
    }
    //Sort peaks so that the largest one is first
    peaks.sort(function(a, b) { return data[b] - data[a]; });
    //if(peaks.length > 0)
    //    console.log(peaks);
    //Return array
    return peaks;
}
The idea is that I walk through the data and remember a value that is larger than the threshold minimal_val. If the next look_range values are all smaller than the chosen value, it is considered a peak. This algorithm is not very smart, but it's very easy to implement.
However, it can't tell which is the major frequency of the string, much as I anticipated:
The red dots highlight the strongest peak
Here's a jsFiddle to see how it really works (or rather doesn't work).

What you see in the spectrum of a string tone is the set of harmonics at
f0, 2*f0, 3*f0, ...
with f0 being the fundamental frequency, or pitch, of your string tone.
To estimate f0 from the spectrum (the output of the FFT, absolute value, possibly logarithmic) you should not look for the strongest component, but for the distance between all these harmonics.
One very nice method to do so is a second (inverse) FFT of the (absolute, real) spectrum. This produces a strong line at t0 == 1/f0.
The sequence fft -> abs()^2 -> inverse fft is equivalent to calculating the auto-correlation function (ACF), thanks to the Wiener–Khinchin theorem, which relates the ACF to the power spectrum (the squared magnitude).
The precision of this approach depends on the length of the FFT (or ACF) and your sampling rate. You can improve precision a lot if you interpolate the "real" maximum between the sampling points of the result using a sinc function.
For even better results you could correct the intermediate spectrum: most sounds have, on average, a pink spectrum. If you amplify the higher frequencies (according to an inverse pink spectrum) before the inverse FFT, the ACF will be "better" (it takes the higher harmonics more into account, improving accuracy).
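For illustration, here is a minimal sketch of the ACF route in JavaScript. It computes the autocorrelation directly in the time domain (the double-FFT route computes the same thing much faster) and picks the strongest lag; samples and sampleRate are assumed inputs, not part of the original code.

// Estimate f0 by finding the lag at which the signal best correlates
// with itself. minFreq/maxFreq bound the search to plausible pitches.
function estimatePitch(samples, sampleRate, minFreq, maxFreq) {
    var minLag = Math.floor(sampleRate / maxFreq);
    var maxLag = Math.floor(sampleRate / minFreq);
    var bestLag = -1, bestVal = 0;
    for (var lag = minLag; lag <= maxLag; lag++) {
        var sum = 0;
        for (var i = 0; i + lag < samples.length; i++)
            sum += samples[i] * samples[i + lag];
        if (sum > bestVal) { bestVal = sum; bestLag = lag; }
    }
    // The strongest lag is the period t0 in samples, so f0 = rate / lag.
    return bestLag > 0 ? sampleRate / bestLag : 0;
}

Interpolating around bestLag (with sinc, or even a simple parabola through the three nearest points) refines the estimate beyond the one-sample resolution.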

Related

creating audio file based on frequencies

I'm using node.js for a project I'm doing.
The project is to convert words into numbers and then to take those numbers and create an audio output.
The audio output should play the numbers as frequencies. For example, I have an array of numbers [913, 250, 352] and I want to play those numbers as frequencies.
I know I can play them in the browser with the Web Audio API or any other third-party package that allows me to do so.
The thing is that I want to create an actual audio file. I tried to convert those numbers into notes and then save them as a MIDI file. I succeeded, but the problem is that the MIDI file takes each frequency and converts it into the closest note (example: 913 will convert into 932.33 Hz, which is note number 81):
// add a track
var array = gematriaArray
var count = 0
var track = midi.addTrack()
var note
for (var i = 0; i < array.length; i++) {
    note = array[i]
    track = track.addNote({
        // here I'm converting the freq -> midi note
        midi: ftom(parseInt(note)),
        time: count,
        duration: 3
    })
    count++
}
// write the output
fs.writeFileSync('./public/sounds/' + name + random + '.mid', Buffer.from(midi.toArray()))
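For reference, my understanding is that ftom does something like the standard equal-temperament conversion below (a sketch for illustration, not the library's actual code); the rounding step is exactly what snaps an arbitrary frequency to the closest note:

// Hypothetical sketch of a standard frequency <-> MIDI conversion.
// Math.round is what quantizes 913 Hz to the nearest equal-tempered pitch.
function ftom(freq) {
    return Math.round(69 + 12 * Math.log2(freq / 440));
}
function mtof(midi) {
    return 440 * Math.pow(2, (midi - 69) / 12);
}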
I searched the internet but couldn't find anything that helps.
I really want to have a file that the user can download with those numbers played as frequencies. Does anyone know what can be done to get this result?
Thanks in advance for the helpers.
This function will populate a buffer with floating point values which represent the height of the raw audio curve for the given frequency:
var pop_audio_buffer_custom = function (number_of_samples, given_freq, samples_per_second) {
    number_of_samples = Math.round(number_of_samples);
    var audio_obj = {};
    var source_buffer = new Float32Array(number_of_samples);
    audio_obj.buffer = source_buffer;
    var incr_theta = (2.0 * Math.PI * given_freq) / samples_per_second;
    var theta = 0.0;
    for (var curr_sample = 0; curr_sample < number_of_samples; curr_sample++) {
        audio_obj.buffer[curr_sample] = Math.sin(theta);
        // console.log(audio_obj.buffer[curr_sample], "theta ", theta);
        theta += incr_theta;
    }
    return audio_obj;
}; // pop_audio_buffer_custom

var number_of_samples = 10000; // long enough to be audible
var given_freq = 300;
var samples_per_second = 44100; // CD quality sample rate
var wav_output_filename = "/tmp/wav_output_filename.wav"
var synthesized_obj = {};
// take the Float32Array out of the returned object
synthesized_obj.buffer = pop_audio_buffer_custom(number_of_samples, given_freq, samples_per_second).buffer;
The world of digital audio is non-trivial. The next step, once you have an audio buffer, is to translate the floating point representation into something which can be stored in bytes (typically 16-bit integers, depending on your choice of bit depth). Then that 16-bit integer buffer needs to be written out as a WAV file.
Audio is a wave, sometimes called a time series. When you pound your fist onto the table, the table wobbles up and down, which pushes tiny air molecules in unison with that wobble. This wobbling of air propagates across the room and reaches a microphone diaphragm, or maybe your eardrum, which in turn wobbles in resonance with the wave. If you glued a pencil onto the diaphragm so it wobbled along with it, and you slowly slid a strip of paper along the lead tip of the pencil, you would see a curve being written onto that paper strip. That is the audio curve, and an audio sample is just the height of that curve at an instant of time. If you repeatedly wrote down this curve height value X times per second at a constant rate, you would have a list of data points of raw audio (which is what the above function creates). Since computers are discrete rather than continuous, they cannot handle the entire pencil-drawn curve, so they only keep this list of instantaneously measured curve height values; those are the audio samples.
The 32-bit floating point buffer above can be fed into the following function to return a 16-bit integer byte buffer:
var convert_32_bit_float_into_signed_16_bit_int_lossy = function(input_32_bit_buffer) {
    // this method is LOSSY - intended as a preliminary step when saving audio into WAV format files
    // output is a byte array where each 16 bit sample
    // is spread across two bytes in little endian ordering
    var size_source_buffer = input_32_bit_buffer.length;
    var buffer_byte_array = new Uint8Array(size_source_buffer * 2); // two bytes per 16-bit sample
    var value_16_bit_signed_int;
    var index_byte = 0;
    console.log("size_source_buffer", size_source_buffer);
    for (var index = 0; index < size_source_buffer; index++) {
        value_16_bit_signed_int = ~~((0 < input_32_bit_buffer[index]) ? input_32_bit_buffer[index] * 0x7FFF :
                                                                        input_32_bit_buffer[index] * 0x8000);
        buffer_byte_array[index_byte] = value_16_bit_signed_int & 0xFF;            // pluck out only the least significant byte
        buffer_byte_array[index_byte + 1] = (value_16_bit_signed_int >> 8) & 0xFF; // bit shift down to access the most significant byte
        index_byte += 2;
    }
    // ---
    return buffer_byte_array;
};
The next step is to persist the above 16-bit int buffer into a WAV file. I suggest you use one of the many nodejs libraries for that (or even better, write your own, as it's only two pages of code ;-)))
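For reference, here is a minimal sketch of such a writer, assuming the mono, 16-bit, little-endian byte buffer produced by the conversion function above; the 44-byte RIFF/WAVE header layout is standard, but the function and variable names are mine:

var fs = require('fs');

// Write a mono 16-bit PCM WAV file from a byte array of little-endian samples.
var write_wav_file = function (filename, pcm_bytes, samples_per_second) {
    var data_size = pcm_bytes.length;                 // bytes of sample data
    var header = Buffer.alloc(44);
    header.write('RIFF', 0);
    header.writeUInt32LE(36 + data_size, 4);          // total file size minus 8
    header.write('WAVE', 8);
    header.write('fmt ', 12);
    header.writeUInt32LE(16, 16);                     // fmt chunk size for PCM
    header.writeUInt16LE(1, 20);                      // audio format 1 = PCM
    header.writeUInt16LE(1, 22);                      // 1 channel (mono)
    header.writeUInt32LE(samples_per_second, 24);     // sample rate
    header.writeUInt32LE(samples_per_second * 2, 28); // byte rate = rate * block align
    header.writeUInt16LE(2, 32);                      // block align: 2 bytes per frame
    header.writeUInt16LE(16, 34);                     // bits per sample
    header.write('data', 36);
    header.writeUInt32LE(data_size, 40);
    fs.writeFileSync(filename, Buffer.concat([header, Buffer.from(pcm_bytes)]));
};

Tying the pieces above together would then look something like:

write_wav_file(wav_output_filename,
               convert_32_bit_float_into_signed_16_bit_int_lossy(synthesized_obj.buffer),
               samples_per_second);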

Custom filter bank is not generating the expected output

Please, refer to this article.
I have implemented section 4.1 (Pre-processing).
The preprocessing step aims to enhance image features along a set of chosen directions. First, the image is grey-scaled and filtered with a sharpening filter (we subtract from the image its local-mean filtered version), thus eliminating the DC component.
We selected 12 non-overlapping filters, to analyze 12 different directions, rotated by 15° with respect to each other.
The GitHub repository is here.
Since the given formula in the article is incorrect, I have tried two different sets of formulas.
The first set of formulas,
The second set of formulas,
The expected output should be,
Neither of them gives proper results.
Can anyone suggest a modification?
The most relevant part of the source code is here:
public List<Bitmap> Apply(Bitmap bitmap)
{
    Kernels = new List<KassWitkinKernel>();
    double degrees = FilterAngle;
    KassWitkinKernel kernel;
    for (int i = 0; i < NoOfFilters; i++)
    {
        kernel = new KassWitkinKernel();
        kernel.Width = KernelDimension;
        kernel.Height = KernelDimension;
        kernel.CenterX = (kernel.Width) / 2;
        kernel.CenterY = (kernel.Height) / 2;
        kernel.Du = 2;
        kernel.Dv = 2;
        kernel.ThetaInRadian = Tools.DegreeToRadian(degrees);
        kernel.Compute();
        //SleuthEye
        kernel.Pad(kernel.Width, kernel.Height, WidthWithPadding, HeightWithPadding);
        Kernels.Add(kernel);
        degrees += degrees;
    }

    List<Bitmap> list = new List<Bitmap>();
    Bitmap image = (Bitmap)bitmap.Clone();
    //PictureBoxForm f = new PictureBoxForm(image);
    //f.ShowDialog();
    Complex[,] cImagePadded = ImageDataConverter.ToComplex(image);
    Complex[,] fftImage = FourierTransform.ForwardFFT(cImagePadded);
    foreach (KassWitkinKernel k in Kernels)
    {
        Complex[,] cKernelPadded = k.ToComplexPadded();
        Complex[,] convolved = Convolution.ConvolveInFrequencyDomain(fftImage, cKernelPadded);
        Bitmap temp = ImageDataConverter.ToBitmap(convolved);
        list.Add(temp);
    }
    return list;
}
Perhaps the first thing that should be mentioned is that the filters should be generated with angles which increase in FilterAngle (in your case 15 degree) increments. This can be accomplished by modifying KassWitkinFilterBank.Apply as follows (see this commit):
public List<Bitmap> Apply(Bitmap bitmap)
{
    // ...
    // The generated template filter from the equations gives a line at 45 degrees.
    // To get the filter to highlight lines starting with an angle of 90 degrees
    // we should start with an additional 45 degrees offset.
    double degrees = 45;
    KassWitkinKernel kernel;
    for (int i = 0; i < NoOfFilters; i++)
    {
        // ... setup filter (unchanged)

        // Now increment the angle by FilterAngle
        // (not "+= degrees" which doubles the value at each step)
        degrees += FilterAngle;
    }
This should give you the following result:
It is not quite the result from the paper and the differences between the images are still quite subtle, but you should be able to notice that the scratch line is most intense in the 8th figure (as would be expected since the scratch angle is approximately 100-105 degrees).
To improve the result, we should feed the filters with a pre-processed image in the same way as described in the paper:
First, the image is grey-scaled and filtered with a sharpening filter (we subtract from the image its local-mean filtered version), thus eliminating the DC component.
When you do so, you will get a matrix of values, some of which will be negative. As a result this intermediate processing result is not suitable to be stored as a Bitmap. As a general rule when performing image processing, you should keep all intermediate results in double or Complex as appropriate, and only convert back the final result to Bitmap for visualization.
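As an aside, here is a minimal sketch of that sharpening step (in JavaScript for brevity, since the repository itself is C#): it subtracts the local-mean (box-filtered) version of a grayscale image from the image itself and keeps the result in floating point so negative values survive. The gray, width, height, and radius names are assumptions for illustration.

// Subtract the local mean from each pixel, removing the DC component.
// `gray` is a row-major Float64Array of grayscale values.
function sharpen(gray, width, height, radius) {
    var out = new Float64Array(gray.length);
    for (var y = 0; y < height; y++) {
        for (var x = 0; x < width; x++) {
            var sum = 0, count = 0;
            for (var dy = -radius; dy <= radius; dy++) {
                for (var dx = -radius; dx <= radius; dx++) {
                    var nx = x + dx, ny = y + dy;
                    if (nx >= 0 && nx < width && ny >= 0 && ny < height) {
                        sum += gray[ny * width + nx];
                        count++;
                    }
                }
            }
            // image minus its local mean; deliberately NOT clamped to [0,255]
            out[y * width + x] = gray[y * width + x] - sum / count;
        }
    }
    return out;
}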
Integrating your changes to add image sharpening from your GitHub repository, while keeping intermediate results as doubles, can be achieved by changing the input bitmap and temporary image variables to use the double[,] datatype instead of Bitmap in the KassWitkinFilterBank.Apply method (see this commit):
public List<Bitmap> Apply(double[,] bitmap)
{
    // [...]
    double[,] image = (double[,])bitmap.Clone();
    // [...]
}
which should give you the following result:
Or to better highlight the difference, here is figure 1 (0 degrees) on the left, next to figure 8 (105 degrees) on the right:

Traverse a graph in parallel

I'm revising for an exam (still) and have come across a question (posted below) that has me stumped. In summary, I think the question is asking: "Think of any_old_process that has to traverse a graph and do some work on the objects it finds, including adding more work." My question is: what data structure can be parallelised to achieve the goals set out in the question?
The role of a garbage collector (GC) is to reclaim unused memory.
Tracing collectors must identify all live objects by traversing graphs
of objects induced by aggregation relationships. In brief, the GC has
some work-list of tasks to perform. It repeatedly (a) acquires a task
(e.g. an object to inspect), (b) performs the task (e.g. marks the
object unless it is already marked), and (c) generates further tasks
(e.g. adds the children of an unmarked task to the work-list). It is
desirable to parallelise this operation.
In a single-threaded
environment, the work-list is usually a single LIFO stack. What would
you have to do to make this safe for a parallel GC? Would this be a
sensible design for a parallel GC? Discuss designs of data structure
to support a parallel GC that would scale better. Explain why you
would expect them to scale better.
The natural data structure for a graph is, well, a graph: a set of graph elements (nodes) which can refer to other elements. For better cache reuse, the elements can be placed/allocated in an array or arrays (generally, vectors) in order to put neighboring elements as close in memory as possible. Generally, each element or group of elements should have a mutex (spin_mutex) to protect access to it; contention then means that some other thread is busy working on it, so there is no need to wait. Though, if possible, an atomic operation over a flag/state field is preferable, to mark the element as visited without a lock. For example, the simplest data structure can be the following:
struct object {
    vector<object*> references;
    atomic<bool> is_visited; // for simplicity, or an epoch counter
                             // if nothing resets it to false
    void inspect();          // processing method
};
vector<object> objects;      // also for simplicity, if it can be for real
                             // things like `parallel_for` would be perfect here
Given this data structure, and the way the GC's work is described, it fits recursive parallelism such as the divide-and-conquer pattern perfectly:
void object::inspect() {
    if( ! is_visited.exchange(true) ) {
        for( object* o : references ) // alternatively it can be `parallel_for` in some variants
            cilk_spawn o->inspect();  // for Cilk, or `task_group::run` for TBB or PPL
        // further processing of the object
    }
}
If the question is about how the tasks are organized, I'd recommend a work-stealing scheduler (like TBB's or Cilk's). There are tons of papers on this subject. To put it simply, each worker thread has its own, but shared, deque of tasks, and when its deque is empty, a thread steals tasks from other threads' deques.
The scalability comes from the property that each task can add some other tasks which can then be worked on in parallel. A minimal sketch of the loop being parallelised follows.
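Here is that work-list loop in its single-threaded form (in JavaScript for brevity; the marked and children fields are hypothetical). A work-stealing version gives each worker its own such stack, as a deque that other workers can steal from when theirs runs empty.

// The (a) acquire / (b) perform / (c) generate cycle over a LIFO work-list.
function mark(roots) {
    var stack = roots.slice();               // the work-list
    while (stack.length > 0) {
        var obj = stack.pop();               // (a) acquire a task
        if (!obj.marked) {
            obj.marked = true;               // (b) perform the task
            for (var i = 0; i < obj.children.length; i++)
                stack.push(obj.children[i]); // (c) generate further tasks
        }
    }
}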
Your questions:
Think of any_old_process that has to traverse a graph and do some work on the objects it finds, including adding more work.
... what data structure can be parallelised to achieve the goals set out in the question?
Quoted questions:
Some stuff about garbage collection.
Since you are specifically interested in parallelizing graph algorithms, I'll give an example of one kind of graph traversal that can be parallelized well.
Executive Summary
Finding local minima ("basins") or maxima ("peaks") is a useful operation in digital image processing. A concrete example is geological watershed analysis. One approach to the problem treats each pixel or small group of pixels in the image as a node and finds non-overlapping minimum spanning trees (MST) with the local minima as the tree roots.
Gory details
Below is a simplistic example. It's a web interview question from Palantir Technologies brought to Programming Puzzles & Code Golf by AnkitSablok. It's simplified by two assumptions:
That a pixel/cell only has 4 neighbors instead of the usual eight.
That a cell either has all uphill neighbors (it's the local minimum) or has a unique downhill neighbor. I.e., plains aren't allowed.
Below that is some JavaScript that solves this problem. It violates every reasonable coding standard against use of side-effects, but illustrates where some of the opportunities for parallelization exist.
In the "Create list of sinks (i.e. roots)" loop, note that each cell can be evaluated completely independently for elevation with respect to it's neighbors as long as the elevation data is static. In a sequential program, one thread of execution examines each cell. In a parallel program, the cells are divvied up so that one, and only one, thread reads and writes the local minima state information (sink[] in the program below). If generating the list of minima/roots in parallel, the queuing operations for the stack would have to be synchronized. For a discussion how to do that for stacks and other queues, see "Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms", Michael & Scott, 1996. For modern updates, follow the citation tree on Google Scholar (no mutex required :).
In the "Each root explores it's basin" loop, note that each basin could explored/enumerated/flooded in parallel.
If you want dive deeper into parallelizing MSTs, see "Scalable Parallel Minimum Spanning Forest Computation", Nobari, Cao, arras, Bressan, 2012. The first two pages contain a clear and concise survey of the field.
Simplified example
A group of farmers has some elevation data, and we’re going to help them understand how rainfall flows over their farmland. We’ll represent the land as a two-dimensional array of altitudes and use the following model, based on the idea that water flows downhill:
If a cell’s four neighboring cells all have higher altitudes, we call this cell a sink; water collects in sinks. Otherwise, water will flow to the neighboring cell with the lowest altitude. If a cell is not a sink, you may assume it has a unique lowest neighbor and that this neighbor will be lower than the cell.
Cells that drain into the same sink – directly or indirectly – are said to be part of the same basin.
Your challenge is to partition the map into basins. In particular, given a map of elevations, your code should partition the map into basins and output the sizes of the basins, in descending order.
Assume the elevation maps are square. Input will begin with a line with one integer, S, the height (and width) of the map. The next S lines will each contain a row of the map, each with S integers – the elevations of the S cells in the row. Some farmers have small land plots such as the examples below, while some have larger plots. However, in no case will a farmer have a plot of land larger than S = 5000.
Your code should output a space-separated list of the basin sizes, in descending order. (Trailing spaces are ignored.)
Here's an example:
Input:
5
1 0 2 5 8
2 3 4 7 9
3 5 7 8 9
1 2 5 4 2
3 3 5 2 1
Output: 11 7 7
The basins, labeled with A’s, B’s, and C’s, are:
A A A A A
A A A A A
B B A C C
B B B C C
B B C C C
// lm.js - find the local minima
// Globalization of variables.
/*
    The map is a 2 dimensional array. Indices for the elements map as:

    [0,0] ... [0,n]
    ...
    [n,0] ... [n,n]

    Each element of the array is a structure. The structure for each element is:

    Item    Purpose             Range                   Comment
    ----    -------             -----                   -------
    h       Height of cell      integers
    s       Is it a sink?       boolean
    x       X of downhill cell  (0..maxIndex)           if s is true, x&y point to self
    y       Y of downhill cell  (0..maxIndex)
    b       Basin name          ('A'..'A'+# of basins)

    Use a separate array-of-arrays for each structure item. The index range is
    0..maxIndex.
*/
var height = [];
var sink = [];
var downhillX = [];
var downhillY = [];
var basin = [];
var maxIndex;

// A list of sinks in the map. Each element is an array of [ x, y ], where
// both x & y are in the range 0..maxIndex.
var basinList = [];

// An unordered list of basin sizes.
var basinSize = [];

// Functions.
function isSink(x,y) {
    var myHeight = height[x][y];
    var imaSink = true;
    var bestDownhillHeight = myHeight;
    var bestDownhillX = x;
    var bestDownhillY = y;
    /*
        Visit the neighbors. If this cell is the lowest, then it's the
        sink. If not, find the steepest downhill direction.
    */
    function visit(deltaX,deltaY) {
        var neighborX = x+deltaX;
        var neighborY = y+deltaY;
        if (myHeight > height[neighborX][neighborY]) {
            imaSink = false;
            if (bestDownhillHeight > height[neighborX][neighborY]) {
                bestDownhillHeight = height[neighborX][neighborY];
                bestDownhillX = neighborX;
                bestDownhillY = neighborY;
            }
        }
    }
    if (x !== 0) {
        // upwards neighbor exists
        visit(-1,0);
    }
    if (x !== maxIndex) {
        // downwards neighbor exists
        visit(1,0);
    }
    if (y !== 0) {
        // left-hand neighbor exists
        visit(0,-1);
    }
    if (y !== maxIndex) {
        // right-hand neighbor exists
        visit(0,1);
    }
    downhillX[x][y] = bestDownhillX;
    downhillY[x][y] = bestDownhillY;
    return imaSink;
}

function exploreBasin(x,y,currentSize,basinName) {
    // This cell is in the basin.
    basin[x][y] = basinName;
    currentSize++;
    /*
        Visit all neighbors that have this cell as the best downhill
        path and add them to the basin.
    */
    function visit(x,deltaX,y,deltaY) {
        if ((downhillX[x+deltaX][y+deltaY] === x) && (downhillY[x+deltaX][y+deltaY] === y)) {
            currentSize = exploreBasin(x+deltaX,y+deltaY,currentSize,basinName);
        }
        return 0;
    }
    if (x !== 0) {
        // upwards neighbor exists
        visit(x,-1,y,0);
    }
    if (x !== maxIndex) {
        // downwards neighbor exists
        visit(x,1,y,0);
    }
    if (y !== 0) {
        // left-hand neighbor exists
        visit(x,0,y,-1);
    }
    if (y !== maxIndex) {
        // right-hand neighbor exists
        visit(x,0,y,1);
    }
    return currentSize;
}

// Read map from file (1st argument).
var lines = $EXEC('cat "' + $ARG[0] + '"').split('\n');
maxIndex = lines.shift() - 1;
for (var i = 0; i<=maxIndex; i++) {
    // Convert each row to numbers so height comparisons aren't lexicographic.
    height[i] = lines.shift().split(' ').map(Number);
    // Create all other 2D arrays.
    sink[i] = [];
    downhillX[i] = [];
    downhillY[i] = [];
    basin[i] = [];
}
for (var i = 0; i<=maxIndex; i++) { print(height[i]); }

// Everyone decides if they are a sink. Create list of sinks (i.e. roots).
for (var x=0; x<=maxIndex; x++) {
    for (var y=0; y<=maxIndex; y++) {
        if (sink[x][y] = isSink(x,y)) { // note: assignment, not comparison - stores and tests
            // This node is a root (AKA sink).
            basinList.push([x,y]);
        }
    }
}
//for (var i = 0; i<=maxIndex; i++) { print(sink[i]); }

// Each root explores its basin.
var basinName = 'A';
for (var i=basinList.length-1; i>=0; --i) { // i-- makes Closure Compiler sad
    var x = basinList[i][0];
    var y = basinList[i][1];
    basinSize.push(exploreBasin(x,y,0,basinName));
    basinName = String.fromCharCode(basinName.charCodeAt() + 1);
}
for (var i = 0; i<=maxIndex; i++) { print(basin[i]); }

// Done.
print(basinSize.sort(function(a, b){return b-a}).join(' '));

Bayes' formula for updating probabilistic map

I'm trying to get a mobile robot to map an arena based on what it can see from a camera. I've created a map, and managed to get the robot to identify items placed in the arena and give an estimated location; however, as I'm only using an RGB camera, the resulting numbers can vary slightly every frame due to noise, changes in lighting, etc. What I am now trying to do is create a probability map using Bayes' formula to give a better map of the arena.
Bayes' Formula
P(i | x) = p(i) p(x|i) / sum_j( p(j) p(x|j) )
This is what I've got so far. All points on the map are initialised to 0.5.
// Gets the likelihood of the event being correct
// Para 1 = Is the object likely to be at that location?
// Para 2 = Is the sensor saying it's at that location?
private double getProbabilityNum(bool world, bool sensor)
{
    if (world && sensor)
    {
        // number to test the function works
        return 0.6;
    }
    else if (world && !sensor)
    {
        // number to test the function works
        return 0.4;
    }
    else if (!world && sensor)
    {
        // number to test the function works
        return 0.2;
    }
    else //if (!world && !sensor)
    {
        // number to test the function works
        return 0.8;
    }
}
// A function to update the map's probability of an object being at location (x,y)
// Para 3 = does the sensor pick up an object at (x,y)?
public double probabilisticMap(int x, int y, bool sensor)
{
    // gets the current likelihood from the map (prior probability)
    double mapProb = get(x,y);
    // decide if an object is at location (x,y)
    bool world = (mapProb < threshold);
    // Bayes' formula to update the probability
    double newProb =
        (getProbabilityNum(world, sensor) * mapProb) / ((getProbabilityNum(world, sensor) * mapProb) + (getProbabilityNum(!world, sensor) * (1 - mapProb)));
    // update the location on the map
    set(x,y,newProb);
    // return the probability as well
    return newProb;
}
It does work, but the numbers seem to jump rapidly and then flicker when they are at the top; it also errors if the numbers drop too near to zero. Does anyone have any idea why this might be happening? I think it's something to do with the way the equation is coded, but I'm not too sure. (I found this, but I don't quite understand it, so I'm not sure of its relevance, though it seems to be talking about the same thing.)
Thanks in advance.
Use log-likelihoods when doing numerical computations involving probabilities.
Consider
P(i | x) = p(i) p(x|i) / sum_j( p(j) p(x|j) ).
Because x is fixed, the denominator, p(x), is a constant. Thus
P(i | x) ~ p(i)p(x|i)
where ~ denotes "is proportional to."
The log-likelihood function is just the log of this. That is,
L(i | x) = log(p(i)) + log(p(x|i)).
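For illustration, here is a minimal sketch (in JavaScript for brevity) of the log-odds form of this update, as used in occupancy grids. The sensor-model constants mirror the test numbers above; all names are assumptions for illustration.

// Each cell stores log-odds instead of a probability; a cell initialised
// to probability 0.5 starts at log-odds 0.
var P_HIT_GIVEN_OBJECT = 0.6;  // p(sensor fires | object present)
var P_HIT_GIVEN_EMPTY  = 0.2;  // p(sensor fires | object absent)
var LOG_ODDS_LIMIT = 10;       // clamp so cells never saturate

function updateCell(logOdds, sensor) {
    var ratio = sensor
        ? P_HIT_GIVEN_OBJECT / P_HIT_GIVEN_EMPTY              // evidence for
        : (1 - P_HIT_GIVEN_OBJECT) / (1 - P_HIT_GIVEN_EMPTY); // evidence against
    logOdds += Math.log(ratio);  // Bayes' rule becomes an addition
    return Math.max(-LOG_ODDS_LIMIT, Math.min(LOG_ODDS_LIMIT, logOdds));
}

// Convert back to a probability only when displaying the map.
function toProbability(logOdds) {
    return 1 - 1 / (1 + Math.exp(logOdds));
}

Because the update is a bounded addition, the value can neither jump past the clamp nor underflow near zero, which is one way to avoid both symptoms described in the question.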

How to compute the visible area based on a heightmap?

I have a heightmap. I want to efficiently compute which tiles in it are visible from an eye at any given location and height.
This paper suggests that heightmaps outperform turning the terrain into some kind of mesh, but it samples the grid using Bresenham's.
If I were to adopt that, I'd have to do a line-of-sight Bresenham's line for each and every tile on the map. It occurs to me that it ought to be possible to reuse most of the calculations and compute the viewshed in a single pass if you fill outwards away from the eye - a scanline-fill kind of approach, perhaps?
But the logic escapes me. What would the logic be?
Here is a heightmap with the visibility from a particular vantage point (green cube) painted over it ("viewshed", as in "watershed"?):
Here is the O(n) sweep that I came up with; it seems the same as the one given in the paper in the answer below (Franklin and Ray's method), except that I walk from the eye outwards instead of walking the perimeter doing a Bresenham's towards the centre. To my mind, my approach would have much better caching behaviour - i.e. be faster - and use less memory, since it doesn't have to track the vector for each tile, only remember a scanline's worth:
typedef std::vector<float> visbuf_t;

inline void map::_visibility_scan(const visbuf_t& in, visbuf_t& out, const vec_t& eye, int start_x, int stop_x, int y, int prev_y) {
    const int xdir = (start_x < stop_x)? 1: -1;
    for(int x=start_x; x!=stop_x; x+=xdir) {
        const int x_diff = abs(eye.x-x), y_diff = abs(eye.z-y);
        const bool horiz = (x_diff >= y_diff);
        const int x_step = horiz? 1: x_diff/y_diff;
        const int in_x = x-x_step*xdir; // where in the in buffer would we get the inner value?
        const float outer_d = vec2_t(x,y).distance(vec2_t(eye.x,eye.z));
        const float inner_d = vec2_t(in_x,horiz? y: prev_y).distance(vec2_t(eye.x,eye.z));
        const float inner = (horiz? out: in).at(in_x)*(outer_d/inner_d); // get the inner value, scaling by distance
        const float outer = height_at(x,y)-eye.y; // height we are at right now in the map, eye-relative
        if(inner <= outer) {
            out.at(x) = outer;
            vis.at(y*width+x) = VISIBLE;
        } else {
            out.at(x) = inner;
            vis.at(y*width+x) = NOT_VISIBLE;
        }
    }
}

void map::visibility_add(const vec_t& eye) {
    const float BASE = -10000; // represents a downward vector that would always be visible
    visbuf_t scan_0, scan_out, scan_in;
    scan_0.resize(width);
    vis[eye.z*width+eye.x-1] = vis[eye.z*width+eye.x] = vis[eye.z*width+eye.x+1] = VISIBLE;
    scan_0.at(eye.x) = BASE;
    scan_0.at(eye.x-1) = BASE;
    scan_0.at(eye.x+1) = BASE;
    _visibility_scan(scan_0,scan_0,eye,eye.x+2,width,eye.z,eye.z);
    _visibility_scan(scan_0,scan_0,eye,eye.x-2,-1,eye.z,eye.z);
    scan_out = scan_0;
    for(int y=eye.z+1; y<height; y++) {
        scan_in = scan_out;
        _visibility_scan(scan_in,scan_out,eye,eye.x,-1,y,y-1);
        _visibility_scan(scan_in,scan_out,eye,eye.x,width,y,y-1);
    }
    scan_out = scan_0;
    for(int y=eye.z-1; y>=0; y--) {
        scan_in = scan_out;
        _visibility_scan(scan_in,scan_out,eye,eye.x,-1,y,y+1);
        _visibility_scan(scan_in,scan_out,eye,eye.x,width,y,y+1);
    }
}
Is it a valid approach?
It is using centre-points rather than looking at the slope between the 'inner' pixel and its neighbour on the side that the LoS passes.
Could the trig used to scale the vectors and such be replaced by factor multiplication?
It could use an array of bytes, since the heights are themselves bytes.
It's not a radial sweep; it does a whole scanline at a time, but away from the point. It only uses a couple of scanlines' worth of additional memory, which is neat.
If it works, you could imagine distributing it nicely using a radial sweep of blocks: you have to compute the centre-most tile first, but then you can distribute all immediately adjacent tiles from that (they just need to be given the edge-most intermediate values), and then in turn get more and more parallelism.
So how do I most efficiently calculate this viewshed?
What you want is called a sweep algorithm. Basically, you cast rays (Bresenham's) to each of the perimeter cells, but keep track of the horizon as you go and mark every cell you pass on the way as visible or invisible (updating the ray's horizon if visible). This gets you down from the O(n^3) of the naive approach (testing each cell of an n x n DEM individually) to O(n^2).
There is a more detailed description of the algorithm in section 5.1 of this paper (which you might also find interesting for other reasons if you aspire to work with really enormous heightmaps). A sketch of the per-ray horizon test follows.
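Here is a minimal sketch (in JavaScript for brevity) of one ray of that sweep; heightAt and the visible grid are hypothetical helpers, and a simple DDA walk stands in for Bresenham's:

// Walk from the eye toward one perimeter cell, tracking the steepest
// eye-relative slope (the horizon) seen so far; a cell is visible only
// if it rises to or above that horizon.
function castRay(eyeX, eyeY, eyeH, targetX, targetY) {
    var dx = targetX - eyeX, dy = targetY - eyeY;
    var steps = Math.max(Math.abs(dx), Math.abs(dy));
    var horizon = -Infinity;
    for (var i = 1; i <= steps; i++) {
        var x = Math.round(eyeX + dx * i / steps);
        var y = Math.round(eyeY + dy * i / steps);
        var dist = Math.hypot(x - eyeX, y - eyeY);
        var slope = (heightAt(x, y) - eyeH) / dist;
        if (slope >= horizon) {
            visible[y][x] = true;
            horizon = slope; // this cell raises the horizon behind it
        }
    }
}

Calling castRay once per perimeter cell covers the whole map in O(n^2) total work, since each of the O(n) rays visits O(n) cells.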
