copy contents of a list to a Dictionary<int, string> - c#-4.0

I am pretty new to C# and stuck on a problem with collections. I have a dictionary with some values, say [(5, "abc"), (6, "def")]. I call a method and get a List like {"mno", "pqr"}. Now I want to update the values of the dictionary with the contents of the list. The problem is that the keys of the map may start from any number, say 5, while the indexes of the new list always start at 0. The list and the map have the same length, but I can't do map[i] = list[i] because the keys and the indexes won't match. Can someone please tell me how to replace the contents of the dictionary with those of the list?

Simple For Loop
for (int index = 0; index < map.Count; index++)
{
map[map.ElementAt(index).Key] = list[index];
}
Using LINQ
var result = map.Zip(list, (m, l) => new { Key = m.Key, Value = l })
.ToDictionary(p => p.Key, p => p.Value);

Here's a method using LINQ that will create a new Dictionary with the same keys as the old dictionary, but values coming from the list:
map = map.Keys
.OrderBy(k => k)
.Zip(list, (k, v) => new KeyValuePair<int, string>(k, v))
.ToDictionary(kvp => kvp.Key, kvp => kvp.Value);
If you need to update the original Dictionary instead of creating a new one, this will do it:
foreach (var kvp in map.Keys
.OrderBy(k => k)
.Zip(list, (k, v) => new KeyValuePair<int, string>(k, v)))
map[kvp.Key] = kvp.Value;
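The same idea, sketched in Python purely as an illustration (the answers above are C#; update_values_in_key_order is a hypothetical helper name, not part of any library). It pairs the dictionary's keys, sorted ascending, with the list items by position and overwrites each value. It assumes, as the question states, that the dictionary and the list have the same length, and that the "first" key means the smallest key.

```python
def update_values_in_key_order(mapping, new_values):
    """Overwrite mapping's values with new_values, matching sorted keys to list positions."""
    if len(mapping) != len(new_values):
        raise ValueError("dictionary and list must have the same length")
    # sorted() returns a new list of keys, so assigning into mapping while looping is safe
    for key, value in zip(sorted(mapping), new_values):
        mapping[key] = value
    return mapping


if __name__ == "__main__":
    data = {5: "abc", 6: "def"}
    update_values_in_key_order(data, ["mno", "pqr"])
    print(data)  # {5: 'mno', 6: 'pqr'}
```

Sorting the keys first is the important design choice: it makes the pairing independent of whatever order the dictionary happens to enumerate its entries in.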

Related

Remove Array Json object elements

Here I have two arrays
var arry1 = [{id:1,name:"muni"},{id:2,name:"Anji"}, {id:3,name:"vinod"},{id:4,name:"anil"}];
var arry2 = [{id:3},{id:1}];
I want the following results
arry1= [{id:2,name:"Anji"},{id:4,name:"anil"}]
The elements whose ids appear in the second array should be removed from the first array.
You can use Array.filter to remove any element that is present in arry2. We can create a Set of id elements to filter out, this will be more efficient for larger arrays:
var arry1 = [{id:1,name:"muni"},{id:2,name:"Anji"}, {id:3,name:"vinod"},{id:4,name:"anil"}];
var arry2 = [{id:3},{id:1}];
// Filter out any elements in arry1 that are also present in arry2, first create a Set of IDs to filter
const idsToFilter = new Set(arry2.map(el => el.id));
const result = arry1.filter(el => !idsToFilter.has(el.id));
console.log("Result:", result)
While removing from an array, you should iterate backwards over it.
for (let i = arry1.length - 1; i >=0; i--) {
...
}
This ensures that no elements are skipped after an element is removed. See also this other question for more info on this.
Now for each element of arry1 we want to check whether it should be removed.
let idsToRemove = arry2.map(e => e.id); // [3, 1]
for (let i = arry1.length - 1; i >=0; i--) {
if (idsToRemove.includes(arry1[i].id)) {
// it should be removed
arry1.splice(i, 1);
}
}
Something like the above should then work for your problem. For easier understanding of the code, I first mapped arry2 to only the IDs, but of course you can also use another loop to see whether there is a match. The most important take-away is that to safely remove from an array while iterating it, you need to iterate from the last to the first element.
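For comparison, here is a minimal Python sketch of the same backward-iteration technique (Python is used only for illustration; remove_by_ids is a hypothetical name). It deletes matching elements in place while walking from the last index to the first, so a deletion never shifts an element that has not been visited yet.

```python
def remove_by_ids(items, ids_to_remove):
    """Delete, in place, every dict in items whose "id" is in ids_to_remove."""
    ids = set(ids_to_remove)
    # Walk from the last index down to 0 so that removing items[i]
    # only shifts elements that have already been checked.
    for i in range(len(items) - 1, -1, -1):
        if items[i]["id"] in ids:
            del items[i]
    return items


if __name__ == "__main__":
    arry1 = [
        {"id": 1, "name": "muni"},
        {"id": 2, "name": "Anji"},
        {"id": 3, "name": "vinod"},
        {"id": 4, "name": "anil"},
    ]
    arry2 = [{"id": 3}, {"id": 1}]
    remove_by_ids(arry1, [e["id"] for e in arry2])
    print(arry1)  # [{'id': 2, 'name': 'Anji'}, {'id': 4, 'name': 'anil'}]
```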
Try this. Here filter keeps only those elements of arry1 whose id does not exist in arry2:
var myArray = arry1.filter(ar => !arry2.find(el => (el.id === ar.id) ))

Groovy: Get index of all occurrences of sublist from arraylist

I am new to Groovy and am trying to find the indexes of all occurrences of a sublist in a list.
I tried to use something like Collections.indexOfSubList, as in Java, but it throws an exception saying it applies to Lists and not ArrayLists.
So I am trying to define my own function. I find the indices of each element of the smaller list in the longer list and then subtract the resulting indices from each other. If the difference is 1, I treat that index as the start of a sublist.
I know that my logic is a little twisted. Can somebody suggest a better and more efficient way of doing this?
Below is my code:
List list1 = [1,2,3,4,5,6,1,2,3]
List list2 = [1,2]
index1 = list1.findIndexValues {
it == list2[0];
}
index2 = list1.findIndexValues {
it == list2[1];
}
println index1
println index2
result = []
for (int i = 0; i < index1.size(); i++) {
result.add(index2[i]-index1[i]);
}
println result
Edit: no longer uses Collections due to new issue re: Elastic Search.
The following code walks along the source list, taking the sublist that starts at each index and checking whether that sublist starts with the target list. See the asserts below (note that the indexes are 0-based):
def listStartsWithSubList = { source, target ->
def result = false
if (source.size() >= target.size()) {
result = true
target.eachWithIndex { item, index ->
result = result && (item == source[index])
}
}
result
}
def indexOfSubLists = { source, target ->
def results = []
source.eachWithIndex { item, index ->
def tmpList = source[index..source.size()-1]
if (listStartsWithSubList(tmpList, target)) {
results << index
}
}
results
}
assert [1] == indexOfSubLists([1,2,3], [2,3])
assert [2] == indexOfSubLists([1,2,3], [3])
assert [] == indexOfSubLists([1,2,3], [4])
assert [0,6] == indexOfSubLists([1,2,3,4,5,6,1,2,3], [1,2])
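The same check can be sketched more compactly with slices. This is a Python illustration of the idea, not a translation of the Groovy closures above; index_of_sublists is a hypothetical name, and returning no matches for an empty target is an assumption of this sketch.

```python
def index_of_sublists(source, target):
    """Return every 0-based index at which target occurs as a contiguous run in source."""
    n = len(target)
    if n == 0:
        return []  # assumption of this sketch: an empty target matches nowhere
    # Only start positions that leave room for the whole target need checking
    return [i for i in range(len(source) - n + 1) if source[i:i + n] == target]


if __name__ == "__main__":
    print(index_of_sublists([1, 2, 3, 4, 5, 6, 1, 2, 3], [1, 2]))  # [0, 6]
```

Stopping at len(source) - n avoids building sublists that are too short to match.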

I want to collect the data frame column values in an array list to conduct some computations, is it possible?

I am loading data from phoenix through this:
val tableDF = sqlContext.phoenixTableAsDataFrame("Hbtable", Array("ID", "distance"), conf = configuration)
and want to carry out the following computation on the column values distance:
val list=Array(10,20,30,40,10,20,0,10,20,30,40,50,60)//list of values from the column distance
val first=list(0)
val last=list(list.length-1)
var m = 0;
for (a <- 0 to list.length-2) {
if (list(a + 1) < list(a) && list(a+1)>=0)
{
m = m + list(a)
}
}
val totalDist=(m+last-first)
You can do something like this, where df is your DataFrame (tableDF in the question). It returns an Array[Any]:
val array = df.select("distance").rdd.map(r => r(0)).collect()
If you want the proper element type, you can use the following, which returns an Array[Int]:
val array = df.select("distance").rdd.map(r => r(0).asInstanceOf[Int]).collect()
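Once the column has been collected into a local array, the computation from the question is an ordinary loop. Here is a Python sketch of that loop for illustration (total_distance is a hypothetical name; returning 0 for an empty column is an assumption, since the question does not cover that case).

```python
def total_distance(values):
    """Replicate the question's loop over an already-collected list of distances."""
    if not values:
        return 0  # assumption: an empty column yields 0
    m = 0
    for current, following in zip(values, values[1:]):
        # The reading dropped to a smaller, non-negative value:
        # add the value seen just before the drop.
        if 0 <= following < current:
            m += current
    return m + values[-1] - values[0]


if __name__ == "__main__":
    distances = [10, 20, 30, 40, 10, 20, 0, 10, 20, 30, 40, 50, 60]
    print(total_distance(distances))  # 110
```

For the sample data the drops happen after 40 and after 20, so m is 60 and the total is 60 + 60 - 10 = 110.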

Matrix Transpose on RowMatrix in Spark

Suppose I have a RowMatrix.
How can I transpose it? The API documentation does not seem to have a transpose method.
Matrix has a transpose() method, but it is not distributed. If I have a large matrix that does not fit in memory, how can I transpose it?
I have converted a RowMatrix to DenseMatrix as follows
DenseMatrix Mat = new DenseMatrix(m,n,MatArr);
which requires converting the RowMatrix to JavaRDD and converting JavaRDD to an array.
Is there any other convenient way to do the conversion?
Thanks in advance
If anybody is interested, I've implemented the distributed version that @javadba proposed.
def transposeRowMatrix(m: RowMatrix): RowMatrix = {
val transposedRowsRDD = m.rows.zipWithIndex.map{case (row, rowIndex) => rowToTransposedTriplet(row, rowIndex)}
.flatMap(x => x) // now we have triplets (newRowIndex, (newColIndex, value))
.groupByKey
.sortByKey().map(_._2) // sort rows and remove row indexes
.map(buildRow) // restore order of elements in each row and remove column indexes
new RowMatrix(transposedRowsRDD)
}
def rowToTransposedTriplet(row: Vector, rowIndex: Long): Array[(Long, (Long, Double))] = {
val indexedRow = row.toArray.zipWithIndex
indexedRow.map{case (value, colIndex) => (colIndex.toLong, (rowIndex, value))}
}
def buildRow(rowWithIndexes: Iterable[(Long, Double)]): Vector = {
val resArr = new Array[Double](rowWithIndexes.size)
rowWithIndexes.foreach{case (index, value) =>
resArr(index.toInt) = value
}
Vectors.dense(resArr)
}
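To make the triplet idea easier to follow, here is a small local sketch in plain Python (no Spark involved; ordinary lists stand in for the RDD, and transpose_rows is a hypothetical name). Each value is emitted under its old column index, which becomes its new row index; the groups are then sorted by that index and each new row is filled in by position.

```python
from collections import defaultdict


def transpose_rows(rows):
    """Transpose a list of equal-length rows via (new_row, (new_col, value)) triplets."""
    grouped = defaultdict(list)
    for row_index, row in enumerate(rows):
        for col_index, value in enumerate(row):
            # The old column index becomes the new row index, and vice versa
            grouped[col_index].append((row_index, value))

    transposed = []
    for new_row_index in sorted(grouped):  # restore row order (sortByKey in the Scala code)
        entries = grouped[new_row_index]
        new_row = [0.0] * len(entries)
        for new_col_index, value in entries:  # place each value by its index (buildRow)
            new_row[new_col_index] = value
        transposed.append(new_row)
    return transposed


if __name__ == "__main__":
    print(transpose_rows([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]))
    # [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
```

Placing values by their stored index, rather than relying on arrival order, matters in the distributed setting because grouping does not guarantee the order of elements within a group.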
You can use BlockMatrix, which can be created from an IndexedRowMatrix:
BlockMatrix matA = new IndexedRowMatrix(...).toBlockMatrix().cache();
matA.validate();
BlockMatrix matB = matA.transpose();
The result can then easily be converted back to an IndexedRowMatrix. This is described in the Spark documentation.
You are correct: there is no RowMatrix.transpose() method. You will need to do this operation manually.
Here is the non-distributed (local matrix) version:
def transpose(m: Array[Array[Double]]): Array[Array[Double]] = {
(for {
c <- m(0).indices
} yield m.map(_(c)) ).toArray
}
The distributed version would be along the following lines:
origMatRdd.rows.zipWithIndex.map{ case (rvect, i) =>
rvect.zipWithIndex.map{ case (ax, j) => ((j,(i,ax))
}.groupByKey
.sortBy{ case (i, ax) => i }
.foldByKey(new DenseVector(origMatRdd.numRows())) { case (dv, (ix,ax)) =>
dv(ix) = ax
}
Caveat: I have not tested the above: it will have bugs. But the basic approach is valid - and similar to work I had done in the past for a small LinAlg library for spark.
For a very large and sparse matrix (like the one you get from text feature extraction), the best and easiest way is:
def transposeRowMatrix(m: RowMatrix): RowMatrix = {
val indexedRM = new IndexedRowMatrix(m.rows.zipWithIndex.map({
case (row, idx) => new IndexedRow(idx, row)}))
val transposed = indexedRM.toCoordinateMatrix().transpose.toIndexedRowMatrix()
new RowMatrix(transposed.rows
.map(idxRow => (idxRow.index, idxRow.vector))
.sortByKey().map(_._2))
}
For a matrix that is not so sparse, you can use BlockMatrix as the bridge, as mentioned in aletapool's answer above.
However, aletapool's answer misses a very important point: when you go RowMatrix -> IndexedRowMatrix -> BlockMatrix -> transpose -> BlockMatrix -> IndexedRowMatrix -> RowMatrix, you have to sort in the last step (IndexedRowMatrix -> RowMatrix). By default, converting from IndexedRowMatrix to RowMatrix simply drops the index, so the row order will be messed up.
val data = Array(
MllibVectors.sparse(5, Seq((1, 1.0), (3, 7.0))),
MllibVectors.dense(2.0, 0.0, 3.0, 4.0, 5.0),
MllibVectors.dense(4.0, 0.0, 0.0, 6.0, 7.0),
MllibVectors.sparse(5, Seq((2, 2.0), (3, 7.0))))
val dataRDD = sc.parallelize(data, 4)
val testMat: RowMatrix = new RowMatrix(dataRDD)
testMat.rows.collect().map(_.toDense).foreach(println)
[0.0,1.0,0.0,7.0,0.0]
[2.0,0.0,3.0,4.0,5.0]
[4.0,0.0,0.0,6.0,7.0]
[0.0,0.0,2.0,7.0,0.0]
transposeRowMatrix(testMat).
rows.collect().map(_.toDense).foreach(println)
[0.0,2.0,4.0,0.0]
[1.0,0.0,0.0,0.0]
[0.0,3.0,0.0,2.0]
[7.0,4.0,6.0,7.0]
[0.0,5.0,7.0,0.0]
Getting the transpose of RowMatrix in Java:
public static RowMatrix transposeRM(JavaSparkContext jsc, RowMatrix mat){
List<Vector> newList=new ArrayList<Vector>();
List<Vector> vs = mat.rows().toJavaRDD().collect();
double [][] tmp=new double[(int)mat.numCols()][(int)mat.numRows()] ;
for(int i=0; i < vs.size(); i++){
double[] rr=vs.get(i).toArray();
for(int j=0; j < mat.numCols(); j++){
tmp[j][i]=rr[j];
}
}
for(int i=0; i < mat.numCols();i++)
newList.add(Vectors.dense(tmp[i]));
JavaRDD<Vector> rows2 = jsc.parallelize(newList);
RowMatrix newmat = new RowMatrix(rows2.rdd());
return (newmat);
}
This is a variant of the previous solution that also works for a sparse row matrix and keeps the transposed matrix sparse when needed:
def transpose(X: RowMatrix): RowMatrix = {
val m = X.numRows ().toInt
val n = X.numCols ().toInt
val transposed = X.rows.zipWithIndex.flatMap {
case (sp: SparseVector, i: Long) => sp.indices.zip (sp.values).map {case (j, value) => (i, j, value)}
case (dp: DenseVector, i: Long) => Range (0, n).toArray.zip (dp.values).map {case (j, value) => (i, j, value)}
}.sortBy (t => t._1).groupBy (t => t._2).map {case (i, g) =>
val (indices, values) = g.map {case (i, j, value) => (i.toInt, value)}.unzip
if (indices.size == m) {
(i, Vectors.dense (values.toArray) )
} else {
(i, Vectors.sparse (m, indices.toArray, values.toArray))
}
}.sortBy(t => t._1).map (t => t._2)
new RowMatrix (transposed)
}
Hope this helps!

How to transform a list of dictionaries with unique keys to a dictionary whose value is a list?

I have an arbitrary number of dictionaries (which are in a list, already in order) that I wish to outer join. For example, for N = 2:
List<Dictionary<string, int>> lstInput = new List<Dictionary<string, int>>();
Dictionary<string, int> dctTest1 = new Dictionary<string, int>();
Dictionary<string, int> dctTest2 = new Dictionary<string, int>();
dctTest1.Add("ABC", 123);
dctTest2.Add("ABC", 321);
dctTest2.Add("CBA", 321);
lstInput.Add(dctTest1);
lstInput.Add(dctTest2);
Each dictionary already has unique keys.
I wish to transform lstInput into:
Dictionary<string, int[]> dctOutput = new Dictionary<string, int[]>();
where dctOutput looks like:
"ABC": [123, 321]
"CBA": [0, 321]
That is, the set of keys of dctOutput is equal to the union of the set of keys of each dictionary in lstInput; moreover, the *i*th position of each value in dctOutput is equal to the value of the corresponding key in the *i*th dictionary in lstInput, or 0 if there is no corresponding key.
How can I write C# code to accomplish this?
The following should do what you want.
var dctOutput = new Dictionary<string, int[]>();
for (int i = 0; i < lstInput.Count; ++i)
{
var dict = lstInput[i];
foreach (var kvp in dict)
{
int[] values;
if (!dctOutput.TryGetValue(kvp.Key, out values))
{
// Allocating the array zeros the values
values = new int[lstInput.Count];
dctOutput.Add(kvp.Key, values);
}
values[i] = kvp.Value;
}
}
This works because allocating the array initializes all values to 0. So if a previous dictionary didn't have an item with that key, its values will be 0 in that position. If you wanted your sentinel value to be something other than 0, then you would initialize the array with that value after allocating it.
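The same approach, sketched in Python for illustration only (outer_join and its missing parameter are hypothetical names; the answer above is the C# version). Each output list is pre-filled with the sentinel the first time a key is seen, so positions belonging to dictionaries that lack the key keep that sentinel.

```python
def outer_join(dicts, missing=0):
    """Map each key to a list whose i-th item is dicts[i][key], or missing if absent."""
    output = {}
    for i, d in enumerate(dicts):
        for key, value in d.items():
            # setdefault stores the pre-filled list only the first time a key is seen
            output.setdefault(key, [missing] * len(dicts))[i] = value
    return output


if __name__ == "__main__":
    lst_input = [{"ABC": 123}, {"ABC": 321, "CBA": 321}]
    print(outer_join(lst_input))  # {'ABC': [123, 321], 'CBA': [0, 321]}
```

Passing a different missing value, for example outer_join(lst_input, missing=-1), covers the case where 0 is a legitimate value and cannot serve as the sentinel.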
