Creating a Matrix in MathDotNet given a 2D array - math.net

I have the following code, which I downloaded from this site here. Once the zip file is extracted, the full path would be something like E:\Csharp2Dand3DTestbed\GraphicsBook\LA\LA\MatrixTransform2.cs
protected static double[,] MatrixInverse(double[,] mat)
{
Matrix m = new Matrix(mat);
Matrix k = m.Inverse();
return k;
}
But that does not compile. I see from here that I need to do something like
protected static double[,] MatrixInverse(double[,] mat)
{
Matrix<double> m = Matrix<double>.Build.WhatHere(???)(3, 4); // How with existing matrix
Matrix k = m.Inverse();
return k;
}
Can someone please guide me? I am not able to get any further. I am using the latest version of Math.NET Numerics.

From double[,] to matrix (two options):
var matrix = Matrix<double>.Build.DenseOfArray(array);
var matrix = CreateMatrix.DenseOfArray(array);
From matrix to double[,]:
var array = matrix.ToArray();
Note that these involve a full copy since matrices do not use 2D arrays internally. There are more examples on this in the documentation.

Related

PyCuda program keeps on running

answer_array = np.zeros_like(self.redarray)
answer_array_gpu = cuda.mem_alloc(answer_array.nbytes)
redarray_gpu = cuda.mem_alloc(self.redcont.nbytes)
greenarray_gpu = cuda.mem_alloc(self.greencont.nbytes)
bluearray_gpu = cuda.mem_alloc(self.bluecont.nbytes)
cuda.memcpy_htod(redarray_gpu, self.redcont)
cuda.memcpy_htod(greenarray_gpu, self.greencont)
cuda.memcpy_htod(bluearray_gpu, self.bluecont)
cuda.memcpy_htod(answer_array_gpu, answer_array)
desaturate_mod = SourceModule("""
__global__ void array_desaturation(float *a, float *b, float *c, float *d){
int index = blockIdx.x * blockDim.x + threadIdx.x;
d[index] = ((a[index] + b[index] + c[index])/3);
}
""")
func = desaturate_mod.get_function("array_desaturation")
func(redarray_gpu, greenarray_gpu, bluearray_gpu, answer_array_gpu,
block=(self.gpu_threads, self.gpu_threads, self.blocks_to_use))
desaturated = np.empty_like(self.redarray)
cuda.memcpy_dtoh(desaturated, answer_array_gpu)
print(desaturated)
print("Up to here")
I wrote this piece of code to find the average of the values in three arrays and save it into a fourth array. The code prints neither the result nor the line saying "Up to here". What could be the error?
Additional info: Redarray, greenarray and bluearray are float32 numpy arrays
I know that getting started with arrays in C, and especially in PyCUDA, can be pretty tricky; it took me months to get a 2D sliding max algorithm working.
In a CUDA kernel you are not working with Python array objects: each argument is a pointer to the memory address of the first element of its array, and nothing checks the bounds for you. A useful example of how this works in C can be found here. You will also have to pass in the length of the arrays (assuming they are all equal) so that the kernel does not go out of bounds; if they have different lengths, pass in each length.
Hopefully that link helps you understand how to access array elements via pointers in C. @talonmies also provides a nice example here of how to pass in a 2D array (this is the same as a 1D array, since the 2D array gets flattened to a 1D one in memory on the GPU). However, when I was working with this, I never passed in the strides as @talonmies does; as the TutorialsPoint tutorial says, *(pointer_to_array + index) is correct. Providing a memory stride here will cause you to go out of bounds.
Therefore my code for this would look more like:
C_Code = """
__global__ void array_desaturation(float *array_A, float *array_B, float *array_C, float *outputArray, int arrayLengths){
int index = blockIdx.x * blockDim.x + threadIdx.x;
if(index >= arrayLengths){ // Guard: threads whose index falls outside the arrays must exit, otherwise they would read and write out of bounds with unpredictable results
return;
}
// These variables will get the correct values from the arrays at the appropriate index relative to their unique memory addresses (You could leave this part out but I like the legibility)
float aValue = *(array_A + index);
float bValue = *(array_B + index);
float cValue = *(array_C + index);
*(outputArray + index) = ((aValue + bValue + cValue)/3); //Set the (output arrays's pointer + offset)'s value to our average value
}"""
desaturate_mod = SourceModule(C_Code)
desaturate_kernel = desaturate_mod.get_function("array_desaturation")
desaturate_kernel(cuda.In(array_A), # Input
cuda.In(array_B), # Input
cuda.In(array_C), # Input
cuda.Out(outputArray), # Output
numpy.int32(len(array_A)), # Array Size if they are all the same length
block=(blockSize[0],blockSize[1],1), # Choose the next two parameters however you like, but change your index calculation accordingly
grid=(gridSize[0],gridSize[1],1)
)
print(outputArray) # Done! Make sure you have defined all these arrays before ofc
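The role of the bounds check can be illustrated without a GPU. The following is a minimal CPU-only sketch in plain Python, not PyCUDA code: `desaturate_thread` and `desaturate` are hypothetical names, and the nested loops merely emulate a 1D launch in which every (block, thread) pair runs the kernel body once. It assumes a 1D grid, so the index is computed exactly as in the kernel above.

```python
def desaturate_thread(block_idx, block_dim, thread_idx, a, b, c, out, length):
    """Body of one emulated GPU thread (mirrors the kernel above)."""
    index = block_idx * block_dim + thread_idx
    if index >= length:  # surplus threads in the last block do nothing
        return
    out[index] = (a[index] + b[index] + c[index]) / 3


def desaturate(a, b, c, threads_per_block=4):
    """Emulate a 1D launch on the CPU: one call per (block, thread) pair."""
    length = len(a)
    # Ceiling division: enough blocks to cover every element.
    blocks = (length + threads_per_block - 1) // threads_per_block
    out = [0.0] * length
    for block_idx in range(blocks):
        for thread_idx in range(threads_per_block):
            desaturate_thread(block_idx, threads_per_block, thread_idx,
                              a, b, c, out, length)
    return out


red = [3.0, 6.0, 9.0, 12.0, 15.0]
green = [0.0] * 5
blue = [3.0] * 5
print(desaturate(red, green, blue))  # [2.0, 3.0, 4.0, 5.0, 6.0]
```

With 5 elements and 4 threads per block, 2 blocks (8 threads) are launched, so 3 threads fall past the end of the arrays. In this emulation the guard prevents an IndexError; on a real GPU it prevents out-of-bounds memory access.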

Custom filter bank is not generating the expected output

Please, refer to this article.
I have implemented the section 4.1 (Pre-processing).
The preprocessing step aims to enhance image features along a set of
chosen directions. First, image is grey-scaled and filtered with a
sharpening filter (we subtract from the image its local-mean filtered
version), thus eliminating the DC component.
We selected 12 not overlapping filters, to analyze 12 different
directions, rotated with respect to 15° each other.
GitHub Repository is here.
Since the formula given in the article is incorrect, I have tried two different sets of formulas.
The first set of formulas,
The second set of formulas,
The expected output should be,
Neither of them gives proper results.
Can anyone suggest a modification?
The most relevant part of the source code is here:
public List<Bitmap> Apply(Bitmap bitmap)
{
Kernels = new List<KassWitkinKernel>();
double degrees = FilterAngle;
KassWitkinKernel kernel;
for (int i = 0; i < NoOfFilters; i++)
{
kernel = new KassWitkinKernel();
kernel.Width = KernelDimension;
kernel.Height = KernelDimension;
kernel.CenterX = (kernel.Width) / 2;
kernel.CenterY = (kernel.Height) / 2;
kernel.Du = 2;
kernel.Dv = 2;
kernel.ThetaInRadian = Tools.DegreeToRadian(degrees);
kernel.Compute();
//SleuthEye
kernel.Pad(kernel.Width, kernel.Height, WidthWithPadding, HeightWithPadding);
Kernels.Add(kernel);
degrees += degrees;
}
List<Bitmap> list = new List<Bitmap>();
Bitmap image = (Bitmap)bitmap.Clone();
//PictureBoxForm f = new PictureBoxForm(image);
//f.ShowDialog();
Complex[,] cImagePadded = ImageDataConverter.ToComplex(image);
Complex[,] fftImage = FourierTransform.ForwardFFT(cImagePadded);
foreach (KassWitkinKernel k in Kernels)
{
Complex[,] cKernelPadded = k.ToComplexPadded();
Complex[,] convolved = Convolution.ConvolveInFrequencyDomain(fftImage, cKernelPadded);
Bitmap temp = ImageDataConverter.ToBitmap(convolved);
list.Add(temp);
}
return list;
}
Perhaps the first thing that should be mentioned is that the filters should be generated with angles that increase in FilterAngle increments (15 degrees in your case). This can be accomplished by modifying KassWitkinFilterBank.Apply as follows (see this commit):
public List<Bitmap> Apply(Bitmap bitmap)
{
// ...
// The generated template filter from the equations gives a line at 45 degrees.
// To get the filter to highlight lines starting with an angle of 90 degrees
// we should start with an additional 45 degrees offset.
double degrees = 45;
KassWitkinKernel kernel;
for (int i = 0; i < NoOfFilters; i++)
{
// ... setup filter (unchanged)
// Now increment the angle by FilterAngle
// (not "+= degrees" which doubles the value at each step)
degrees += FilterAngle;
}
This should give you the following result:
It is not quite the result from the paper and the differences between the images are still quite subtle, but you should be able to notice that the scratch line is most intense in the 8th figure (as would be expected since the scratch angle is approximately 100-105 degrees).
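To make the effect of the `degrees += degrees` bug concrete, here is a small sketch in Python (the function names are hypothetical and stand in for the loop in Apply). It only reproduces the angle bookkeeping, not the filters themselves.

```python
def doubling_angles(filter_angle, count):
    """Angles produced by the original loop, where `degrees += degrees`."""
    degrees = filter_angle
    angles = []
    for _ in range(count):
        angles.append(degrees)
        degrees += degrees  # doubles the angle at every step
    return angles


def stepped_angles(start, filter_angle, count):
    """Angles produced by the corrected loop, where `degrees += FilterAngle`."""
    degrees = start
    angles = []
    for _ in range(count):
        angles.append(degrees)
        degrees += filter_angle  # constant increment
    return angles


print(doubling_angles(15, 5))      # [15, 30, 60, 120, 240]
print(stepped_angles(45, 15, 12))  # [45, 60, 75, ..., 210]
```

The doubling version skips most directions after a few iterations, whereas the stepped version covers 12 evenly spaced directions starting from the 45 degree offset.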
To improve the result, we should feed the filters with a pre-processed image in the same way as described in the paper:
First, image is grey-scaled and filtered with a sharpening filter (we subtract from the image its local-mean filtered version), thus eliminating the DC component
When you do so, you will get a matrix of values, some of which will be negative. As a result this intermediate processing result is not suitable to be stored as a Bitmap. As a general rule when performing image processing, you should keep all intermediate results in double or Complex as appropriate, and only convert back the final result to Bitmap for visualization.
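A minimal sketch of the sharpening step in Python shows why negative values appear. It assumes a 3x3 box filter for the local mean and edge replication at the borders; the paper's actual window size is not stated here, and `local_mean` and `sharpen` are illustrative names.

```python
def local_mean(image, radius=1):
    """Mean over a (2*radius+1)^2 window; edges reuse the nearest pixel."""
    height, width = len(image), len(image[0])
    window = (2 * radius + 1) ** 2
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    total += image[yy][xx]
            result[y][x] = total / window
    return result


def sharpen(image, radius=1):
    """Subtract the local mean from the image, which removes the DC component."""
    mean = local_mean(image, radius)
    return [[float(p) - m for p, m in zip(row, mean_row)]
            for row, mean_row in zip(image, mean)]


# A dark image with one bright pixel in the middle.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 9
sharpened = sharpen(image)
print(sharpened[2][2], sharpened[2][1])  # 8.0 -1.0
```

The pixels around the bright spot come out negative, so clamping them into a 0..255 Bitmap at this stage would discard exactly the information the directional filters need.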
Integrating the image-sharpening changes from your GitHub repository while keeping intermediate results as doubles can be achieved by changing the input bitmap and temporary image variables in the KassWitkinFilterBank.Apply method to use the double[,] datatype instead of Bitmap (see this commit):
public List<Bitmap> Apply(double[,] bitmap)
{
// [...]
double[,] image = (double[,])bitmap.Clone();
// [...]
}
which should give you the following result:
Or to better highlight the difference, here is figure 1 (0 degrees) on the left, next to figure 8 (105 degrees) on the right:

Loss of data during the Inverse-FFT of an Image

I am using the following code to convert a Bitmap to Complex and vice versa.
Even though they were copied directly from the Accord.NET framework, I discovered while testing them that repeated use of these static methods causes data loss. As a result, the final output becomes distorted.
public partial class ImageDataConverter
{
#region private static Complex[,] FromBitmapData(BitmapData bmpData)
private static Complex[,] ToComplex(BitmapData bmpData)
{
Complex[,] comp = null;
if (bmpData.PixelFormat == PixelFormat.Format8bppIndexed)
{
int width = bmpData.Width;
int height = bmpData.Height;
int offset = bmpData.Stride - (width * 1);//1 === 1 byte per pixel.
if ((!Tools.IsPowerOf2(width)) || (!Tools.IsPowerOf2(height)))
{
throw new Exception("Imager width and height should be n of 2.");
}
comp = new Complex[width, height];
unsafe
{
byte* src = (byte*)bmpData.Scan0.ToPointer();
for (int y = 0; y < height; y++)
{
for (int x = 0; x < width; x++, src++)
{
comp[y, x] = new Complex((float)*src / 255,
comp[y, x].Imaginary);
}
src += offset;
}
}
}
else
{
throw new Exception("EightBppIndexedImageRequired");
}
return comp;
}
#endregion
public static Complex[,] ToComplex(Bitmap bmp)
{
Complex[,] comp = null;
if (bmp.PixelFormat == PixelFormat.Format8bppIndexed)
{
BitmapData bmpData = bmp.LockBits( new Rectangle(0, 0, bmp.Width, bmp.Height),
ImageLockMode.ReadOnly,
PixelFormat.Format8bppIndexed);
try
{
comp = ToComplex(bmpData);
}
finally
{
bmp.UnlockBits(bmpData);
}
}
else
{
throw new Exception("EightBppIndexedImageRequired");
}
return comp;
}
public static Bitmap ToBitmap(Complex[,] image, bool fourierTransformed)
{
int width = image.GetLength(0);
int height = image.GetLength(1);
Bitmap bmp = Imager.CreateGrayscaleImage(width, height);
BitmapData bmpData = bmp.LockBits(
new Rectangle(0, 0, width, height),
ImageLockMode.ReadWrite,
PixelFormat.Format8bppIndexed);
int offset = bmpData.Stride - width;
double scale = (fourierTransformed) ? Math.Sqrt(width * height) : 1;
unsafe
{
byte* address = (byte*)bmpData.Scan0.ToPointer();
for (int y = 0; y < height; y++)
{
for (int x = 0; x < width; x++, address++)
{
double min = System.Math.Min(255, image[y, x].Magnitude * scale * 255);
*address = (byte)System.Math.Max(0, min);
}
address += offset;
}
}
bmp.UnlockBits(bmpData);
return bmp;
}
}
(The DotNetFiddle link of the complete source code)
(ImageDataConverter)
Output:
As you can see, the FFT is working correctly, but the inverse FFT isn't.
That is because the conversion from bitmap to complex and back isn't working as expected.
What could be done to correct the ToComplex() and ToBitmap() functions so that they don't lose data?
I do not code in C#, so handle this answer with extreme prejudice!
Just from a quick look I spotted a few problems:
ToComplex()
This converts the BMP into a 2D complex matrix. During the conversion you leave the imaginary part unchanged, but at the start of the same function you have:
Complex[,] complex2D = null;
complex2D = new Complex[width, height];
So the imaginary parts are either undefined or zero, depending on your Complex class constructor. This means you are missing half of the data needed for reconstruction! You should restore the original complex matrix from two images, one for the real part and one for the imaginary part of the result.
ToBitmap()
You are saving the magnitude, which I think is sqrt( Re*Re + Im*Im ), so what you store is a magnitude spectrum rather than the original complex values, and you cannot reconstruct them from it. You should store Re and Im in 2 separate images.
8bit per pixel
That is not much and can cause significant round-off errors after FFT/IFFT, so the reconstruction can be really distorted.
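The loss caused by keeping only the magnitude can be reproduced in a few lines. The sketch below is in Python rather than C#, uses a naive 1D DFT on a single row of pixel values instead of a 2D FFT, and the helper name `dft` is an assumption made for the illustration.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (the inverse divides by n)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out


row = [10.0, 200.0, 30.0, 40.0]
spectrum = dft(row)

# Round trip with the full complex spectrum: the row comes back.
full = [z.real for z in dft(spectrum, inverse=True)]

# Round trip after keeping only the magnitude: the phase is gone.
magnitude_only = [z.real for z in dft([abs(z) for z in spectrum], inverse=True)]

print([round(v, 6) for v in full])
print([round(v, 6) for v in magnitude_only])
```

The first list matches the input up to floating-point error, while the second does not, which is the same effect as passing the FFT result through ToBitmap() and ToComplex() before the inverse transform.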
[Edit1] Remedy
There are several options to repair this, for example:
use a floating-point complex matrix for computations and bitmaps only for visualization.
This is the safest way because you avoid the additional conversion round-offs, so it has the best precision. But you need to rewrite your DIP/CV algorithms to work on complex-domain matrices instead of bitmaps, which requires a fair amount of work.
rewrite your conversions to support real and imaginary part images
Your conversion is really bad, as it does not store/restore the Real and Imaginary parts as it should, and it also does not account for negative values (at least I do not see it; instead they are clamped to zero, which is WRONG). I would rewrite the conversion to this:
// conversion scales
static float Re_ofset = 256.0f, Re_scale = 512.0f / 255.0f;
static float Im_ofset = 256.0f, Im_scale = 512.0f / 255.0f;
private static Complex[,] ToComplex(BitmapData bmpRe, BitmapData bmpIm)
{
    //...
    byte* srcRe = (byte*)bmpRe.Scan0.ToPointer();
    byte* srcIm = (byte*)bmpIm.Scan0.ToPointer();
    // for each line
    for (int y = 0; y < height; y++)
    {
        // for each pixel
        for (int x = 0; x < width; x++, srcRe++, srcIm++)
        {
            // undo the offset and scale applied in ToBitmapRe/ToBitmapIm
            float re = (*srcRe) * Re_scale - Re_ofset;
            float im = (*srcIm) * Im_scale - Im_ofset;
            complex2D[y, x] = new Complex(re, im);
        }
        // skip the stride padding at the end of each line
        srcRe += offset;
        srcIm += offset;
    }
    //...
}
public static Bitmap ToBitmapRe(Complex[,] complex2D)
{
    //...
    double Re = (complex2D[y, x].Real + Re_ofset) / Re_scale;
    Re = Math.Min(Re, 255.0);
    Re = Math.Max(Re, 0.0);
    *address = (byte)Re;
    //...
}
public static Bitmap ToBitmapIm(Complex[,] complex2D)
{
    //...
    double Im = (complex2D[y, x].Imaginary + Im_ofset) / Im_scale;
    Im = Math.Min(Im, 255.0);
    Im = Math.Max(Im, 0.0);
    *address = (byte)Im;
    //...
}
Where:
Re_ofset = -min(complex2D[,].Real);
Im_ofset = -min(complex2D[,].Imaginary);
Re_scale = (max(complex2D[,].Real )-min(complex2D[,].Real ))/255.0;
Im_scale = (max(complex2D[,].Imaginary)-min(complex2D[,].Imaginary))/255.0;
or use values that cover a bigger interval than the complex matrix values.
You can also encode both the Real and Imaginary parts into a single image; for example, the first half of the image could be the Real part and the second half the Imaginary part. In that case you do not need to change the function headers or names at all, but you would need to handle the images as 2 joined squares, each with a different meaning.
You can also use RGB images where R = Real, B = Imaginary, or any other encoding that suits you.
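As a rough illustration of the two-image idea, here is a Python sketch that quantizes the real and imaginary parts into separate 8-bit planes and restores them. All names are hypothetical. Note that, unlike the C# fragment above, this sketch stores the minimum itself as the offset (subtracted when encoding, added when decoding), and it derives offset and scale from the data.

```python
def encode_plane(values):
    """Quantize floats to 0..255, returning the bytes plus offset and scale."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for constant data
    pixels = bytes(round((v - lo) / scale) for v in values)
    return pixels, lo, scale


def decode_plane(pixels, lo, scale):
    """Map the stored bytes back to (approximate) float values."""
    return [p * scale + lo for p in pixels]


def encode_complex(values):
    """Store the real and imaginary parts as two separate 8-bit planes."""
    return (encode_plane([z.real for z in values]),
            encode_plane([z.imag for z in values]))


def decode_complex(encoded):
    re_plane, im_plane = encoded
    return [complex(re, im)
            for re, im in zip(decode_plane(*re_plane), decode_plane(*im_plane))]


spectrum = [-20 - 160j, 280 + 0j, -200 + 0j, -20 + 160j]
encoded = encode_complex(spectrum)
restored = decode_complex(encoded)
print(restored)
```

Negative values survive the round trip, and the error per component is at most half a quantization step (`scale / 2`). That remaining error is the 8-bit round-off mentioned above, which is why keeping the data in floating point is still the more precise option.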
[Edit2] some examples to make my points more clear
example of approach #1
The image is stored as a floating-point 2D complex matrix, and bitmaps are created only for visualization. There is little rounding error this way. The values are not normalized, so the range is <0.0,255.0> per pixel/cell at first, but after transforms and scaling it could change greatly.
As you can see, I added scaling so that all pixels are multiplied by 315 to actually see anything, because the FFT output values are small except for a few cells. This is only for visualization; the complex matrix is unchanged.
example of approach #2
As I mentioned before, you do not handle negative values, you normalize values to the range <0,1> and back by scaling and rounding off, and you use only 8 bits per pixel to store the sub-results. I tried to simulate that with my code, and here is what I got (using the complex domain instead of the wrongly used power spectrum as you did). Here is the C++ source, only as a template example, since you do not have the functions and classes behind it:
transform t;
cplx_2D c;
rgb2i(bmp0);
c.ld(bmp0,bmp0);
null_im(c);
c.mul(1.0/255.0);
c.mul(255.0); c.st(bmp0,bmp1); c.ld(bmp0,bmp1); i2iii(bmp0); i2iii(bmp1); c.mul(1.0/255.0);
bmp0->SaveToFile("_out0_Re.bmp");
bmp1->SaveToFile("_out0_Im.bmp");
t. DFFT(c,c);
c.wrap();
c.mul(255.0); c.st(bmp0,bmp1); c.ld(bmp0,bmp1); i2iii(bmp0); i2iii(bmp1); c.mul(1.0/255.0);
bmp0->SaveToFile("_out1_Re.bmp");
bmp1->SaveToFile("_out1_Im.bmp");
c.wrap();
t.iDFFT(c,c);
c.mul(255.0); c.st(bmp0,bmp1); c.ld(bmp0,bmp1); i2iii(bmp0); i2iii(bmp1); c.mul(1.0/255.0);
bmp0->SaveToFile("_out2_Re.bmp");
bmp1->SaveToFile("_out2_Im.bmp");
And here are the sub-results:
As you can see, after the DFFT and wrap the image is really dark and most of the values are rounded off. So the result after unwrap and IDFFT is really poor.
Here are some explanations of the code:
c.st(bmpre,bmpim) is the same as your ToBitmap
c.ld(bmpre,bmpim) is the same as your ToComplex
c.mul(scale) multiplies complex matrix c by scale
rgb2i converts RGB to grayscale intensity <0,255>
i2iii converts grayscale intensity to grayscale RGB image
I'm not really good at these puzzles, but double-check this division.
comp[y, x] = new Complex((float)*src / 255, comp[y, x].Imaginary);
You can lose precision, as described here
in the Remarks section of the Complex class definition.
Maybe this is what happens in your case.
Hope this helps.

How can i store and access images in Mat of opencv

I am trying to use:
cv::Mat source;
const int histSize[] = {intialframes, initialWidth, initialHeight};
source.create(3, histSize, CV_8U);
for saving multiple images in one matrix. However, when I do so, it gives me dims = 3 and -1 in rows and cols.
Is it correct?
If not, what is the bug in it?
If yes, how can I access my images one by one?
Reading the documentation of the class cv::Mat (doc),
you can see that cv::Mat.rows and cv::Mat.cols are the number of rows and columns of a 2D array, and -1 when the matrix has more than 2 dimensions.
With source.create(3, histSize, CV_8U); you are creating a 3D array.
The cv::Mat documentation describes how to access the elements.
With the create method, the matrix is continuous and organized plane by plane.
EDIT
The first part of text in the documentation after the code of the class definition tells you how to access each element of the matrix using the step[] parameter of the matrix:
If you want to access the pixel (u, v) of the image i, you need to get a pointer to the data and use pointer arithmetic to reach the desired pixel:
int sizes[] = { 10, 200, 100 };
cv::Mat M(3, sizes, CV_8UC1);
//get a pointer to the pixel
uchar *px = M.data + M.step[0] * i + M.step[1] * u + M.step[2] * v;
//get the pixel intensity
uchar intensity = *px;
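The same step arithmetic can be tried out without OpenCV. The sketch below is plain Python, not the cv::Mat API: it models a continuous 8-bit, single-channel stack as a flat buffer, and the helper names are hypothetical. It assumes u is the row and v is the column, matching the sizes { images, rows, cols } above.

```python
def make_stack(n_images, rows, cols):
    """Allocate a continuous 8-bit buffer and its per-dimension byte steps."""
    data = bytearray(n_images * rows * cols)
    steps = (rows * cols, cols, 1)  # bytes to skip per image, per row, per pixel
    return data, steps


def pixel_offset(steps, i, u, v):
    """Byte offset of pixel (u, v) of image i, as in the pointer expression above."""
    return steps[0] * i + steps[1] * u + steps[2] * v


def get_image(data, steps, i):
    """Return image i as a zero-copy view of the buffer."""
    return memoryview(data)[steps[0] * i: steps[0] * (i + 1)]


data, steps = make_stack(10, 200, 100)
data[pixel_offset(steps, 3, 17, 42)] = 255

image = get_image(data, steps, 3)
print(image[17 * steps[1] + 42])  # 255
```

Because the planes are stored one after another, image i is simply the block of step[0] bytes starting at offset step[0] * i, which is how the individual images can be read one by one.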

Is there a circular hash function?

Thinking about this question on testing string rotation, I wondered: is there such a thing as a circular/cyclic hash function? E.g.
h(abcdef) = h(bcdefa) = h(cdefab) etc
Uses for this include scalable algorithms which can check n strings against each other to see where some are rotations of others.
I suppose the essence of the hash is to extract information which is order-specific but not position-specific. Maybe something that finds a deterministic 'first position', rotates to it and hashes the result?
It all seems plausible, but slightly beyond my grasp at the moment; it must be out there already...
I'd go along with your deterministic "first position" - find the "least" character; if it appears twice, use the next character as the tie breaker (etc). You can then rotate to a "canonical" position, and hash that in a normal way. If the tie breakers run for the entire course of the string, then you've got a string which is a rotation of itself (if you see what I mean) and it doesn't matter which you pick to be "first".
So:
"abcdef" => hash("abcdef")
"defabc" => hash("abcdef")
"abaac" => hash("aacab") (tie-break between aa, ac and ab)
"cabcab" => hash("abcabc") (it doesn't matter which "a" comes first!)
Update: As Jon pointed out, the first approach doesn't handle strings with repetition very well. Problems arise when duplicate pairs of letters are encountered and the resulting XOR is 0. Here is a modification that I believe fixes the original algorithm. It uses Euclid-Fermat sequences to generate pairwise coprime integers for each additional occurrence of a character in the string. The result is that the XOR for duplicate pairs is non-zero.
I've also cleaned up the algorithm slightly. Note that the array containing the EF sequences only supports characters in the range 0x00 to 0xFF. This was just a cheap way to demonstrate the algorithm. Also, the algorithm still has runtime O(n) where n is the length of the string.
static int Hash(string s)
{
int H = 0;
if (s.Length > 0)
{
//any arbitrary coprime numbers
int a = s.Length, b = s.Length + 1;
//an array of Euclid-Fermat sequences to generate additional coprimes for each duplicate character occurrence
int[] c = new int[0x100]; // one slot per character in the range 0x00 to 0xFF
for (int i = 1; i < c.Length; i++)
{
c[i] = i + 1;
}
Func<char, int> NextCoprime = (x) => c[x] = (c[x] - x) * c[x] + x;
Func<char, char, int> NextPair = (x, y) => a * NextCoprime(x) * x.GetHashCode() + b * y.GetHashCode();
//for i=0 we need to wrap around to the last character
H = NextPair(s[s.Length - 1], s[0]);
//for i=1...n we use the previous character
for (int i = 1; i < s.Length; i++)
{
H ^= NextPair(s[i - 1], s[i]);
}
}
return H;
}
static void Main(string[] args)
{
Console.WriteLine("{0:X8}", Hash("abcdef"));
Console.WriteLine("{0:X8}", Hash("bcdefa"));
Console.WriteLine("{0:X8}", Hash("cdefab"));
Console.WriteLine("{0:X8}", Hash("cdfeab"));
Console.WriteLine("{0:X8}", Hash("a0a0"));
Console.WriteLine("{0:X8}", Hash("1010"));
Console.WriteLine("{0:X8}", Hash("0abc0def0ghi"));
Console.WriteLine("{0:X8}", Hash("0def0abc0ghi"));
}
The output is now:
7F7D7F7F
7F7D7F7F
7F7D7F7F
7F417F4F
C796C7F0
E090E0F0
A909BB71
A959BB71
First Version (which isn't complete): Use XOR which is commutative (order doesn't matter) and another little trick involving coprimes to combine ordered hashes of pairs of letters in the string. Here is an example in C#:
static int Hash(char[] s)
{
//any arbitrary coprime numbers
const int a = 7, b = 13;
int H = 0;
if (s.Length > 0)
{
//for i=0 we need to wrap around to the last character
H ^= (a * s[s.Length - 1].GetHashCode()) + (b * s[0].GetHashCode());
//for i=1...n we use the previous character
for (int i = 1; i < s.Length; i++)
{
H ^= (a * s[i - 1].GetHashCode()) + (b * s[i].GetHashCode());
}
}
return H;
}
static void Main(string[] args)
{
Console.WriteLine(Hash("abcdef".ToCharArray()));
Console.WriteLine(Hash("bcdefa".ToCharArray()));
Console.WriteLine(Hash("cdefab".ToCharArray()));
Console.WriteLine(Hash("cdfeab".ToCharArray()));
}
The output is:
4587590
4587590
4587590
7077996
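For readers who want to experiment with the pair-XOR idea outside .NET, here is a small Python sketch of it. It is an adaptation, not a port: it uses `ord()` in place of `char.GetHashCode()`, so the numeric values differ from the C# output above, and `pair_xor_hash` is a hypothetical name.

```python
def pair_xor_hash(s, a=7, b=13):
    """XOR together an ordered hash of every cyclically adjacent pair."""
    h = 0
    for i in range(len(s)):
        prev, cur = s[i - 1], s[i]  # s[-1] wraps around to the last character
        h ^= a * ord(prev) + b * ord(cur)
    return h


for text in ["abcdef", "bcdefa", "cdefab", "cdfeab", "a0a0", "1010"]:
    print(text, pair_xor_hash(text))
```

All rotations of "abcdef" produce the same value, while "a0a0" and "1010" both collapse to 0 because each adjacent pair occurs twice and cancels under XOR. This is the weakness with repeated pairs that the updated version is meant to address.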
You could find a deterministic first position by always starting at the position with the "lowest" (in terms of alphabetical ordering) substring. So in your case, you'd always start at "a". If there were multiple "a"s, you'd have to take two characters into account etc.
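A direct way to try the "deterministic first position" idea is to take the lexicographically smallest rotation and hash it. The Python sketch below does this naively in O(n^2) time; the function names are illustrative, and Python's built-in `hash` stands in for whatever hash function you prefer.

```python
def canonical_rotation(s):
    """Return the lexicographically smallest rotation of s."""
    if not s:
        return s
    doubled = s + s  # every rotation is a length-n window of s + s
    return min(doubled[i:i + len(s)] for i in range(len(s)))


def circular_hash(s):
    """Hash that is identical for all rotations of the same string."""
    return hash(canonical_rotation(s))


print(canonical_rotation("defabc"))  # abcdef
print(canonical_rotation("abaac"))   # aacab
print(canonical_rotation("cabcab"))  # abcabc
```

Taking the minimum over all rotations resolves ties between repeated characters automatically, at the cost of comparing up to n strings of length n.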
I am sure that you could find a function that can generate the same hash regardless of character position in the input, however, how will you ensure that h(abc) != h(efg) for every conceivable input? (Collisions will occur for all hash algorithms, so I mean, how do you minimize this risk.)
You'd need some additional checks even after generating the hash to ensure that the strings contain the same characters.
Here's an implementation using Linq
public string ToCanonicalOrder(string input)
{
char first = input.OrderBy(x => x).First();
string doubledForRotation = input + input;
string canonicalOrder
= (-1)
.GenerateFrom(x => doubledForRotation.IndexOf(first, x + 1))
.Skip(1) // the -1
.TakeWhile(x => x < input.Length)
.Select(x => doubledForRotation.Substring(x, input.Length))
.OrderBy(x => x)
.First();
return canonicalOrder;
}
assuming generic generator extension method:
public static class TExtensions
{
public static IEnumerable<T> GenerateFrom<T>(this T initial, Func<T, T> next)
{
var current = initial;
while (true)
{
yield return current;
current = next(current);
}
}
}
sample usage:
var sequences = new[]
{
"abcdef", "bcdefa", "cdefab",
"defabc", "efabcd", "fabcde",
"abaac", "cabcab"
};
foreach (string sequence in sequences)
{
Console.WriteLine(ToCanonicalOrder(sequence));
}
output:
abcdef
abcdef
abcdef
abcdef
abcdef
abcdef
aacab
abcabc
then call .GetHashCode() on the result if necessary.
sample usage if ToCanonicalOrder() is converted to an extension method:
sequence.ToCanonicalOrder().GetHashCode();
One possibility is to combine the hash functions of all circular shifts of your input into one meta-hash which does not depend on the order of the inputs.
More formally, consider
for(int i=0; i<string.length; i++) {
result^=string.rotatedBy(i).hashCode();
}
Where you could replace the ^= with any other commutative operation.
As a more concrete example, consider the input
"abcd"
to get the hash we take
hash("abcd") ^ hash("dabc") ^ hash("cdab") ^ hash("bcda").
As we can see, taking the hash of any of these rotations only changes the order in which the XOR terms are evaluated, which doesn't change its value.
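The meta-hash can be sketched in Python as follows. The names are hypothetical, and SHA-256 truncated to 64 bits is used as the underlying hash purely so that results are stable between runs; any string hash would serve.

```python
import hashlib


def stable_hash(s):
    """64-bit integer taken from the SHA-256 digest of s."""
    digest = hashlib.sha256(s.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def rotation_xor_hash(s):
    """XOR the hashes of every rotation of s (O(n^2) work in total)."""
    result = 0
    for i in range(len(s)):
        result ^= stable_hash(s[i:] + s[:i])
    return result


print(rotation_xor_hash("abcd") == rotation_xor_hash("cdab"))  # True
print(rotation_xor_hash("abab"))  # 0
```

One caveat with XOR specifically: when a string repeats itself an even number of times, as "abab" does, each distinct rotation appears an even number of times and the terms cancel to 0. A different commutative operation, such as addition modulo 2^64, avoids that cancellation.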
I did something like this for a project in college. There were 2 approaches I used to try to optimize a Travelling-Salesman problem. I think if the elements are NOT guaranteed to be unique, the second solution would take a bit more checking, but the first one should work.
You can represent the string as a matrix of associations, so abcdef would look like
  a b c d e f
a   x
b     x
c       x
d         x
e           x
f x
But so would any combination of those associations. It would be trivial to compare those matrices.
Another quicker trick would be to rotate the string so that the "first" letter is first. Then if you have the same starting point, the same strings will be identical.
Here is some Ruby code:
def normalize_string(string)
myarray = string.split(//) # split into an array
index = myarray.index(myarray.min) # find the index of the minimum element
index.times do
myarray.push(myarray.shift) # move stuff from the front to the back
end
return myarray.join
end
p normalize_string('abcdef').eql?(normalize_string('defabc')) # should print true
Maybe use a rolling hash for each offset (Rabin-Karp-like) and return the minimum hash value? There could be collisions, though.
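A sketch of that rolling-hash idea in Python is shown below. The polynomial base, the modulus and the function name are arbitrary choices made for this illustration. Each rotation's hash is derived from the previous one in constant time, so the whole computation is O(n).

```python
def min_rolling_hash(s, base=257, mod=(1 << 61) - 1):
    """Smallest polynomial hash over all rotations of s."""
    n = len(s)
    if n == 0:
        return 0
    high = pow(base, n - 1, mod)  # weight of the leading character

    # Hash of the unrotated string.
    h = 0
    for ch in s:
        h = (h * base + ord(ch)) % mod
    best = h

    # Move the leading character to the end, one rotation at a time.
    for ch in s[:-1]:
        c = ord(ch)
        h = ((h - c * high) * base + c) % mod
        best = min(best, h)
    return best


print(min_rolling_hash("abcdef") == min_rolling_hash("defabc"))  # True
print(min_rolling_hash("abcdef") == min_rolling_hash("abcdfe"))  # False
```

As the answer notes, collisions remain possible for long inputs once the values wrap around the modulus, so equal hashes should still be confirmed with a direct comparison when correctness matters.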

Resources