I have multiple lists with data of the following form. (This is a simplified example; in practice the row-vectors have many more dimensions.)
list 1: [num1] [[1,0,0,1,0], [0,0,1,0,1], [0,1,0,1,0], ...]
list 2: [num2] [[0,0,0,1,0], [1,0,0,1,0], [0,0,1,0,0], ...]
...
list n: [numn] [[1,1,0,1,0], [1,0,0,1,1], [0,0,1,0,1], ...]
Every list is marked with its own number [num] (the numbers are not repeated).
The main question is: how can I efficiently find every row-vector that occurs in more than one list, together with the num's of the lists that contain it?
In details:
For example, the row-vector [1,0,0,1,0] occurs in list 1 and list 2, so then I should return [1,0,0,1,0] : [num1], [num2]
Hash tables come to mind first. I think they are the best choice given the large amount of data, but I know hash tables only superficially and I can't work out a clear algorithm for this case. Can anyone advise what I should pay attention to and which modules I should consider? Perhaps there are other efficient approaches?
It is beyond the scope of a regular question to dive into hash tables and such. Suffice it to say that sets in Python are backed by hash tables, so checking set membership takes constant time on average, which is much more efficient than searching through a list.
If order doesn't matter within your list of vectors, you should just think of them as unordered collections (sets). Sets need to contain immutable things, so you cannot put a list into a set, but you can put in tuples. So, if you re-structure your data to be sets of tuples, you are in good shape.
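As a minimal sketch of that restructuring, assuming the raw data is a dict that maps each list number to a list of row-vectors (the names raw and as_sets are made up for illustration):

```python
# Hypothetical input: each list number maps to a list of row-vectors (plain lists).
raw = {
    1: [[1, 0, 0, 1, 0], [0, 0, 1, 0, 1], [0, 1, 0, 1, 0]],
    2: [[0, 0, 0, 1, 0], [1, 0, 0, 1, 0], [0, 0, 1, 0, 0]],
}

# Lists are unhashable, so convert every row to a tuple before building the set.
as_sets = {num: {tuple(row) for row in rows} for num, rows in raw.items()}

# Membership tests are now hash lookups instead of list scans.
print((1, 0, 0, 1, 0) in as_sets[2])  # True
```

Note that duplicate rows within one list collapse into a single tuple, which is usually what you want for this kind of lookup.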
There are then many "cases" of things you might do. Below are a few examples.
data = {1: {(1, 0, 0), (1, 1, 0)},
        2: {(0, 0, 0), (1, 0, 0)},
        3: {(1, 0, 0), (1, 0, 1), (1, 1, 0)}}

# find common vectors in 2 sets
def common_vecs(a, b):
    return a.intersection(b)

# find all the common vectors in a group of sets
def all_common_vecs(grps):
    return set.intersection(*grps)

# find which sets contain a specific vector
def find(vec, data):
    result = set()
    for idx, grp in data.items():
        if vec in grp:
            result.add(idx)
    return result

print(common_vecs(data[1], data[3]))
print(all_common_vecs(data.values()))
print(find((1,0,1), data))
Output:
{(1, 0, 0), (1, 1, 0)}
{(1, 0, 0)}
{3}
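To get every shared vector together with the numbers of the lists that contain it in a single pass, one option is an inverted index. This is only a sketch under the same assumptions as above; shared_vectors is a made-up helper name, not something from a library:

```python
from collections import defaultdict

def shared_vectors(data):
    """Map each vector found in more than one list to the sorted list numbers."""
    index = defaultdict(set)
    for num, rows in data.items():
        for row in rows:
            # tuple() makes the row hashable; it is a no-op for rows that are already tuples
            index[tuple(row)].add(num)
    # keep only the vectors that occur in at least two different lists
    return {vec: sorted(nums) for vec, nums in index.items() if len(nums) > 1}

data = {
    1: {(1, 0, 0), (1, 1, 0)},
    2: {(0, 0, 0), (1, 0, 0)},
    3: {(1, 0, 0), (1, 0, 1), (1, 1, 0)},
}
result = shared_vectors(data)
print(result)  # (1, 0, 0) maps to [1, 2, 3] and (1, 1, 0) maps to [1, 3]
```

Each row is hashed once, so the cost grows with the total number of rows rather than with the number of pairs of lists.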
Is there a way of using the range() function with stride -1?
E.g. using range(10, -10) instead of the square-bracketed values below?
I.e the following line:
for y in range(10,-10)
Instead of
for y in [10,9,8,7,6,5,4,3,2,1,0,-1,-2,-3,-4,-5,-6,-7,-8,-9,-10]:
Obviously one could do this with another kind of loop more elegantly but the range() example would work much better for what I want.
You can specify the stride (including a negative stride) as the third argument, so
range(10,-11,-1)
gives (in Python 3, wrap the call in list() to see the values)
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, -1, -2, -3, -4, -5, -6, -7, -8, -9, -10]
In general, it doesn't cost anything to try. You can simply type this into the interpreter and see what it does.
This is all documented here as:
range(start, stop[, step])
but mostly I'd like to encourage you to play around and see what happens. As you can see, your intuition was spot on.
Yes, by defining a step:
for i in range(10, -11, -1):
    print(i)
In addition to the other good answers, there is an alternative:
for y in reversed(range(-10, 11)):
See the documentation for reversed().
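A quick check, as a sketch, that the reversed() form and the negative-step form produce the same sequence (the variable names are only for illustration):

```python
# Count down from 10 to -10 with an explicit negative step.
descending = list(range(10, -11, -1))

# Build the ascending range and walk it backwards instead.
via_reversed = list(reversed(range(-10, 11)))

print(descending == via_reversed)     # True
print(descending[0], descending[-1])  # 10 -10
```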
Without a third argument, range only counts upward. If the start is larger than the stop and no step is given, the range is empty.
for i in range(10, -10):
The body of the above loop never runs.
To count downward, pass a negative number as the third argument. The stop value itself is not included, so use -11 to reach -10:
for i in range(10, -11, -1):
Yes, but you'll need to specify that you want to step backwards by setting the step argument to -1. The stop value is not included, so use -11 if you want to reach -10.
Use:
for y in range(10, -11, -1):
For your case, range(10, -11, -1)
will be helpful. The first argument is the start value, the second is the stop value (which is not included in the result), and the third is the step size.
When your range is ascending, you do not need to specify the step if you want every number in between, e.g. range(-10, 10) or range(-10, -5).
But when your range is descending, you need to specify a negative step, e.g. -1 as in range(10, -11, -1), or a larger negative step.
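The behaviour described above can be checked with a few short calls. This is just an illustrative sketch, and the variable names are arbitrary:

```python
# Default step is +1, so a start above the stop yields nothing.
empty = list(range(10, -10))
print(empty)      # []

# The stop value is excluded: this ends at -2, not -3.
exclusive = list(range(3, -3, -1))
print(exclusive)  # [3, 2, 1, 0, -1, -2]

# Move the stop one step further to include -3.
inclusive = list(range(3, -4, -1))
print(inclusive)  # [3, 2, 1, 0, -1, -2, -3]

# Larger negative steps work the same way.
stepped = list(range(10, -11, -5))
print(stepped)    # [10, 5, 0, -5, -10]
```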
If you prefer to create a list from the range:
numbers = list(range(-10, 10))
To summarize, I believe these three approaches are the most efficient and relevant to the question:
first = list(x for x in range(10, -11, -1))
second = list(range(-10, 11))
third = [x for x in reversed(range(-10, 11))]
Alternatively, NumPy can create the values directly as an array, which is typically faster than building a Python list item by item when the range is large. You can then convert the array to a list:
import numpy as np
first = -(np.arange(10, -11, -1))
Notice the negation sign for first, which turns the descending values into ascending ones.
second = np.arange(-10, 11)
Convert it to a list as follows, or use it as a numpy.ndarray:
to_the_list = first.tolist()
# Traverse the list in reverse direction
l1 = [2, 4, 3]
for i in range(len(l1) - 1, -1, -1):
    print(l1[i])
Please consider:
dalist={{1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
{2.88`, 2.04`, 4.64`,0.56`, 4.92`, 2.06`, 3.46`, 2.68`, 2.72`,0.820},
{"Laura1", "Laura1", "Laura1", "Laura1", "Laura1",
"Laura1", "Laura1", "Laura1", "Laura1","Laura1"},
{"RIGHT", 0, 1, 15.1`, 0.36`, 505, 20.059375`,15.178125`, ".", "."}}
The actual dataset is about 6 000 rows and 147 columns. However the above reflects its content. I would like to compute some basic statistics, such as the mean. My attempt:
Table[Mean@dalist[[colNO]], {colNO, 1, 4}]
How could I create a function that would:
Skip non-numerical values, and
Count the number of non-numerical values found in each list?
I have not succeeded in finding the right pattern mechanism yet.
First observation: you could use Mean /@ dalist if you wanted to average across rows. You don't need a Table function here.
Try using Cases (documentation), e.g. Mean /@ (Cases[#, _?NumericQ] & /@ dalist)
If you want to be tricky and eliminate rows from your data that have no numeric elements (e.g. your third column), try the following. It first picks only the rows that have some numeric elements, and then takes only the numeric elements from those rows.
Mean /@ (Cases[#, _?NumericQ] & /@ (Cases[dalist, {___, _?NumericQ, ___}]))
To count the non-numeric elements, you would use a similar approach:
Length /@ (Cases[#, Except[_?NumericQ]] & /@ dalist)
This answer has the caveat that I typed it out without the benefit of a Mathematica installation to actually check my syntax. Some typos could remain.
Here is a variation of Verbeia's answer that you may consider.
Assuming that this is a rectangular array (all rows are the same length), then setting d to the row length (which can be found with Dimensions):
d = 10;
{d - Length@#, Mean@#} &@Select[#, NumericQ] & /@ dalist
(* Out: *) {{0, 11/2}, {0, 2.678}, {10, Mean[{}]}, {3, 79.5282}}
That is, pairs of {number_of_non-numeric, average}.
Mean[{}] appears where there are no numeric values to average. This could be removed from the list with DeleteCases but the results would no longer align with the rows of dalist. I think it would be better to use something like: /. Mean[{}] -> "NO AVERAGE" if needed.
The key to answering your question is the NumberQ function: "NumberQ[expr] gives True if expr is a number, and False otherwise."
To compute the mean of only numeric elements in each list:
Map[Function[lst, Mean[Select[lst, NumberQ]]], dalist]
To count the number of non-numeric elements in each list:
Map[Function[lst, Length[Select[lst, Function[x, !NumberQ[x]]]]], dalist]
Here is a problem that I am not sure can be solved in Mathematica.
(* Courtesy to Lunchtime Playground Blog *)
to3d[plot_, height_, opacity_] :=
  Module[{newplot},
    newplot = First@Graphics[plot];
    newplot = N@newplot /. {x_?AtomQ, y_?AtomQ} -> {x, y, height} /.
      Arrowheads[List[List[x_, y_, notz_]]] ->
        Arrowheads[List[List[x, y]]];
    newplot /. GraphicsComplex[xx__] -> {Opacity[opacity], GraphicsComplex[xx]}];
(* A function to combine 2D Graphics object in Mathematica *)
test[list_]:=VectorQ[list,SameQ[Head[#],Graphics]&];
My3DPlot[list_?(test[#] &), height_?(VectorQ[#, NumberQ] &), opacity_?(VectorQ[#, NumberQ] &), opts : OptionsPattern[]] :=
  Block[{a},
    a = MapThread[Graphics3D[to3d[#1, #2, #3]] &, {list, height, opacity}];
    Show[a, opts]
  ]
(* List of 2D graphics *)
list=Table[ContourPlot[y+Sin[x^i+i y],{x,-3,3},{y,-3,3},Contours->15,ContourLines->False,ColorFunction->RandomChoice[ColorData["Gradients"]]],{i,{1,2,3,4}}];
(* List of heights where you want to place the images *)
height={-.5,0,.5,1};
(* List of opacities you want to apply to your 2D layers *)
opacity={1,.8,.7,.5};
(* The function inherits all the options of standard Graphics3D as they are passed through the Show command *)
My3DPlot[Reverse@list, height, opacity, Lighting -> "Neutral", BoxRatios -> {1, 1, .9}, Axes -> True]
Now this returns a cool picture like this one.
My question is whether it is possible to create a filling between these 2D layers using the same color functions that are used within the contour plots. The goal is to fill the hollow space between the 2D layers with some light or color that changes continuously according to the color function of the neighboring layer.
I hope this can be done in Mathematica but my limited knowledge in Mathematica graphics is making it a difficult hurdle for me.
It should be possible. Texture can be used to generate a 3D texture. The example given in the documentation:
data = Table[{r, g, b}, {r, 0, 1, 1/20}, {g, 0, 1, 1/20}, {b, 0, 1, 1/20}];
Graphics3D[
{
Opacity[1/3],
Texture[data],
EdgeForm[],
Polygon[Table[{{0, 0, z}, {1, 0, z}, {1, 1, z}, {0, 1, z}}, {z, 0, 1, 1/20}],
VertexTextureCoordinates ->
Table[{{0, 0, s}, {1, 0, s}, {1, 1, s}, {0, 1, s}}, {s, 0, 1, 1/20}]]
},
Lighting -> "Neutral"
]
This simulates a volume by using a large set of planes. You can do the same. All you have to do is describe the 3D texture, which should interpolate between the planes you already have. Blend would be the function to use here. For each pixel column in your cube the color varies as Blend[{col1,col2,col3,...},x] with x going from 0 to 1 and coli the color of the pixel in the ith plane given by the contour plots.
The main problem will be that a 3D semi-transparent object with fuzzy color gradients is not something that visualizes very well.
I need to input a variable, say var, into Mathematica function Series[ ] like this: Series[A^2+B^2+C^2, var]. Series[ ] has the following syntax:
Series[f, {x, x_0, n}] generates a power series expansion for f about the point x=x_0 to order n.
Series[f, {x, x_0, n}, {y, y_0, m}, ...] successively finds series expansions with respect to x, then y, etc.
Because I am not always computing Series[ ] in one dimension (i.e., B and C are not always variables at each iteration), var must be properly formatted to fit the dimension demands. The caveat is that Mathematica likes lists, so any table generated will have a set of outer {}.
Suppose my previous code generates the following two sets of sets:
table[1]= {{A, 0, n}};
table[2]= {{A, 0, n}, {B, 0, m}};
My best idea is to use string manipulation (for i= 2):
string = ToString[table[i]];
str = StringReplacePart[string, {" ", " "}, {{1}, {StringLength[string], StringLength[string]}}]
The next step is to convert str to an expression like var and do Series[A^2 + B^2 + C^2, var] by doing var= ToExpression[str], but this returns the following error:
ToExpression::sntx: Invalid syntax in or before "{A, 0, n}, {B, 0, m}".
$Failed
Please help me convert str to an expression properly, or suggest another way to handle this problem.
If I understood correctly, you have
table[2] = {{A, 0, n}, {B, 0, m}};
and are trying to obtain from that
Series[f[A,B],{A,0,n},{B,0,m}]
This may be done using Sequence, like so (I will use series instead of Series to keep it unevaluated so you can see what is happening):
series[f[A, B], Sequence @@ table[2]]
(*
-> series[f[A,B],{A,0,n},{B,0,m}]
*)
So for instance
table[3] = {{A, 0, 2}, {B, 0, 2}};
Series[f[A, B], Sequence @@ table[3]]
gives the right series expansion.
You can use First or Last or, more generally, Part to get the List you want. For example,
var = {{x, 0, 3}, {x, 0, 5}};
Series[1/(1 + x), var[[1]]]
Out[1]= 1 - x + x^2 - x^3 + O[x]^4
Series[1/(1 + x), var[[2]]]
Out[2]= 1 - x + x^2 - x^3 + x^4 - x^5 + O[x]^6
EDIT:
For multiple variables, you can use a SlotSequence (##) along with Apply (@@) like so:
Series[Sin[u + w], ##] & @@ {{u, 0, 3}, {w, 0, 3}}