Colour mixing: finding the proportions of multiple colours - python-3.x

I first created a dictionary of 21 different color codes with their names:
rgb_colors = {"Red":[1.0,0.0,0.0],"Green":[0.0,1.0,0.0],"Blue":[0.0,0.0,1.0],
"Black":[0.0,0.0,0.0],"Almond":[0.94,0.87,0.8],"White":[1.0,1.0,1.0],
"Brown":[0.8,0.5,0.2],"Cadet":[0.33,0.41,0.47],"Camel":[0.76,0.6,0.42],
"Capri":[0.0,0.75,1.0],"Cardinal":[0.77,0.12,0.23],"Ceil":[0.57,0.63,0.81],
"Celadon":[0.67,0.88,0.69],"Champagne":[0.97,0.91,0.81],"Charcoal":[0.21,0.27,0.31],
"Cream":[1.0,0.99,0.82],"Cyan":[0.0,1.0,1.0],"DarkBlue":[0.0,0.0,0.55],
"AmericanRose":[1.0,0.01,0.24],"Gray":[0.5,0.5,0.5],"Wenge":[0.39,0.33,0.32]}
Then I converted it to a DataFrame:
import pandas as pd

RGB = pd.DataFrame(rgb_colors.items(), columns=["Color", "Color Code"])
Then I created a list of all the color codes and asked for an input color. I computed the Euclidean distance between each color code and the input, set a threshold to select codes matching at least 60%, and used the top three codes as the closest colours.
import numpy as np

# list of colors
list_of_rgb = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0],
               [0.94, 0.87, 0.8], [1.0, 1.0, 1.0], [0.8, 0.5, 0.2], [0.33, 0.41, 0.47],
               [0.76, 0.6, 0.42], [0.0, 0.75, 1.0], [0.77, 0.12, 0.23], [0.57, 0.63, 0.81],
               [0.67, 0.88, 0.69], [0.97, 0.91, 0.81], [0.21, 0.27, 0.31], [1.0, 0.99, 0.82],
               [0.0, 1.0, 1.0], [0.0, 0.0, 0.55], [1.0, 0.01, 0.24], [0.5, 0.5, 0.5],
               [0.39, 0.33, 0.32]]

# input color
print("Enter R,G,B color codes")
color1 = []
for i in range(0, 3):
    ele = float(input())
    color1.append(ele)
print(color1)
def closest(colors, color, threshold=60, max_return=3):
    colors = np.array(colors)
    color = np.array(color)
    distances = np.sqrt(np.sum((colors - color) ** 2, axis=1))
    boolean_masks = distances < (1.0 - (threshold / 100))
    outputs = colors[boolean_masks]
    output_distances = distances[boolean_masks]
    return outputs[np.argsort(output_distances)][:max_return]
closest_color = closest(list_of_rgb, color1)
closest_color
Suppose the input is [0.52, 0.5, 0.5]; then the closest colors are:
array([[0.5 , 0.5 , 0.5 ],
       [0.76, 0.6 , 0.42],
       [0.8 , 0.5 , 0.2 ]])
My question is: what percentage of each of these closest colors should be used to obtain the input color?
It can be solved by finding 3 proportions p1, p2 and p3 such that p1 + p2 + p3 = 1 and
p1*(r1,g1,b1) + p2*(r2,g2,b2) + p3*(r3,g3,b3) = (r0,g0,b0)
I'm unable to find p1, p2 and p3. Can anyone help me out on how I can find the p values?

The system of linear equations you are setting up is over-determined, meaning that in general there is no solution.
The additional constraints on the proportions (or more precisely weights) -- summing up to 1, being in the [0, 1] range -- make things worse, because even when a solution exists, it may be discarded because of those additional constraints.
The question in its current form is not mathematically solvable.
Whether you want to include a fixed-sum constraint or not, the mathematics for finding the best weights for a linear combination is very similar, and although exact solutions are not always attainable, approximate solutions are.
One way of computing them is through linear programming, which gets you essentially to @greenerpastures's answer below.
Using Brute Force and Simple Least Squares
Here I propose a more basic approach where only simple linear algebra is involved, but which ignores the requirement for the weights to be in the [0, 1] range (this may be introduced afterwards).
The equations for writing a target color b as a linear combination of colors can be written in matrix form as:
A x = b
with A formed by the colors you want to use, b the target color and x the weights:
/ r0 r1 r2 \   / x0 \   / r_ \
| g0 g1 g2 | . | x1 | = | g_ |
\ b0 b1 b2 /   \ x2 /   \ b_ /
Now, this system of equations admits a single solution if det(A) != 0.
Since the selected colors include an orthonormal basis (pure red, green and blue), you can use those to construct an A with det(A) != 0, and hence an x can always be found.
If the elements of b are in the [0, 1] range, so are the elements of x, because with that basis A is the identity and essentially x = b.
In general, you can find the solution of the Ax = b linear system of equations with np.linalg.solve(), which can be used to look for x when A is formed by other colors, as long as det(A) != 0.
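For example, here is a minimal sketch (values taken from the question) of the exact solve when A is built from the three primaries:
import numpy as np

# columns of A are the base colors; here the orthonormal R, G, B primaries
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]]).T
b = np.array([0.52, 0.5, 0.5])  # target color
x = np.linalg.solve(A, b)       # valid whenever det(A) != 0
print(x)  # [0.52 0.5  0.5 ] -- with the identity basis, x == b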
If you want to include more or fewer colors than the number of channels, then approximate solutions minimizing the sum of squares can be obtained with np.linalg.lstsq(), which implements least-squares approximation: it finds the best weights to assign to n vectors (components), such that their linear combination (weighted sum) is as close as possible to the target vector (i.e. it minimizes the sum of squares).
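As a sketch of the over-determined case, here the same target is approximated with only two base colors from the list (White and Red), so lstsq returns the best fit rather than an exact solution:
# columns of A are White and Red from the question's list
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 0.0, 0.0]]).T
b = np.array([0.52, 0.5, 0.5])
x, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)
print(x)  # best least-squares weights for White and Red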
Once you settle for approximate solutions, the requirement on the sum of the weights becomes an additional equation in the linear system.
This can be included by simply augmenting A and b with an extra dimension set to 1 for A and to q for b, so that A x = b becomes:
/ r0 r1 r2 \   / p0 \   / r3 \
| g0 g1 g2 | . | p1 | = | g3 |
| b0 b1 b2 |   \ p2 /   | b3 |
\ 1  1  1  /            \ q  /
Now the new equation p0 + p1 + p2 = q is included.
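A minimal sketch of this augmentation, reusing the two-color A and b from the previous sketch (with q = 1, so the weights should sum to one):
# append a row of ones to A and the desired weight sum q to b,
# then solve the augmented system by least squares
A_aug = np.vstack([A, np.ones(A.shape[1])])
b_aug = np.append(b, 1.0)  # q = 1.0
p, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)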
While all this can work with arbitrary colors, the ones selected by closeness are not necessarily good candidates for approximating an arbitrary color well.
For example, if the target color is (1, 0, 1) and the 3 closest colors happen to be proportional to each other, say (0.9, 0, 0), (0.8, 0, 0), (0.7, 0, 0), it may be better to use say (0, 0, 0.5) which is farther but can contribute better to make a good approximation than say (0.7, 0, 0).
Given that the number of possible combinations is fairly small, it is possible to try out all colors, in groups of fixed increasing size.
This approach is called brute-force, because we try out all of them.
The least squares method is used to find the weights.
Then we can add additional logic to enforce the constraints we want.
To enforce the weights to sum up to one, it is possible to explicitly normalize them.
To restrict them to a particular range, we can discard weights not complying (perhaps with some tolerance atol to mitigate numerical issues with float comparisons).
The code would read:
import itertools
import dataclasses
from typing import Optional, Tuple, Callable, Sequence
import numpy as np
def best_linear_approx(target: np.ndarray, components: np.ndarray) -> np.ndarray:
    coeffs, _, _, _ = np.linalg.lstsq(components, target, rcond=-1)
    return coeffs
@dataclasses.dataclass
class ColorDecomposition:
    color: np.ndarray
    weights: np.ndarray
    components: np.ndarray
    indices: np.ndarray
    cost: float
    sum_weights: float
def decompose_bf_lsq(
    color: Sequence,
    colors: Sequence,
    max_nums: int = 3,
    min_nums: int = 1,
    min_weights: float = 0.0,
    max_weights: float = 1.0,
    atol: float = 1e-6,
    norm_in_cost: bool = False,
    force_norm: bool = False,
) -> Optional[ColorDecomposition]:
    """Decompose `color` into a linear combination of a number of `colors`.

    This performs a brute-force search.
    Some constraints can be introduced into the decomposition:
    - The weights within a certain range ([`min_weights`, `max_weights`])
    - The weights to accumulate (sum or average) to a certain value.

    The colors are chosen to have minimum sum of squared differences (least squares).
    Additional costs may be introduced in the brute-force search, to favor
    particular solutions when the least squares are the same.

    Args:
        color: The color to decompose.
        colors: The base colors to use for the decomposition.
        max_nums: The maximum number of base colors to use.
        min_nums: The minimum number of base colors to use.
        min_weights: The minimum value for the weights.
        max_weights: The maximum value for the weights.
        atol: The tolerance on the weights.
        norm_in_cost: Include the norm in the cost for the least squares.
        force_norm: If True, the weights are normalized to sum up to 1.

    Returns:
        The resulting color decomposition.
    """
    color = np.array(color)
    colors = np.array(colors)
    num_colors, num_channels = colors.shape
    # augment color/colors
    if norm_in_cost:
        colors = np.concatenate(
            [colors, np.ones(num_colors, dtype=colors.dtype)[:, None]],
            axis=1,
        )
        color = np.concatenate([color, np.ones(1, dtype=colors.dtype)])
    # brute-force search
    best_indices = None
    best_weights = np.zeros(1)
    best_cost = np.inf
    for n in range(min_nums, max_nums + 1):
        for indices in itertools.combinations(range(num_colors), n):
            if np.allclose(color, np.zeros_like(color)):
                # handle the all-zeros case
                weights = np.ones(n)
            else:
                # find the best linear approximation
                weights = best_linear_approx(color, colors[indices, :].T)
            # weights normalization
            if force_norm and np.all(weights > 0.0):
                norm = np.sum(weights)
                weights /= norm
            # add some tolerance
            if atol > 0:
                mask = np.abs(weights) > atol
                weights = weights[mask]
                indices = np.array(indices)[mask].tolist()
            if atol > 0 and max_weights is not None:
                mask = (weights > max_weights - atol) & (weights < max_weights + atol)
                weights[mask] = max_weights
            if atol > 0 and min_weights is not None:
                mask = (weights < min_weights + atol) & (weights > min_weights - atol)
                weights[mask] = min_weights
            # compute the distance between the current approximation and the target
            err = color - (colors[indices, :].T @ weights)
            curr_cost = np.sum(err * err)
            if (
                curr_cost <= best_cost
                and (min_weights is None or np.all(weights >= min_weights))
                and (max_weights is None or np.all(weights <= max_weights))
            ):
                best_indices = indices
                best_weights = weights
                best_cost = curr_cost
    if best_indices is not None:
        return ColorDecomposition(
            color=(colors[best_indices, :].T @ best_weights)[:num_channels],
            weights=best_weights,
            components=[c for c in colors[best_indices, :num_channels]],
            indices=best_indices,
            cost=best_cost,
            sum_weights=np.sum(best_weights),
        )
    else:
        return None
This can be used as follows:
colors = [
[1.0, 0.0, 0.0],
[0.0, 1.0, 0.0],
[0.0, 0.0, 1.0],
[0.0, 0.0, 0.0],
[0.94, 0.87, 0.8],
[1.0, 1.0, 1.0],
[0.8, 0.5, 0.2],
[0.33, 0.41, 0.47],
[0.76, 0.6, 0.42],
[0.0, 0.75, 1.0],
[0.77, 0.12, 0.23],
[0.57, 0.63, 0.81],
[0.67, 0.88, 0.69],
[0.97, 0.91, 0.81],
[0.21, 0.27, 0.31],
[1.0, 0.99, 0.82],
[0.0, 1.0, 1.0],
[0.0, 0.0, 0.55],
[1.0, 0.01, 0.24],
[0.5, 0.5, 0.5],
[0.39, 0.33, 0.32],
]
some_colors = [[0.9, 0.6, 0.5], [0.52, 0.5, 0.5], [0.5, 0.5, 0.5], [0, 0, 0], [1, 1, 1]]
# some_colors = [[0., 0., 0.]]
for color in some_colors:
    print(color)
    print(decompose_bf_lsq(color, colors, max_nums=1))
    print(decompose_bf_lsq(color, colors, max_nums=2))
    print(decompose_bf_lsq(color, colors))
    print(decompose_bf_lsq(color, colors, min_weights=0.0, max_weights=1.0))
    print(decompose_bf_lsq(color, colors, norm_in_cost=True))
    print(decompose_bf_lsq(color, colors, force_norm=True))
    print(decompose_bf_lsq(color, colors, norm_in_cost=True, force_norm=True))
# [0.9, 0.6, 0.5]
# ColorDecomposition(color=array([0.72956991, 0.68444188, 0.60922849]), weights=array([0.75213393]), components=[array([0.97, 0.91, 0.81])], indices=[13], cost=0.048107706898684606, sum_weights=0.7521339326213355)
# ColorDecomposition(color=array([0.9 , 0.60148865, 0.49820272]), weights=array([0.2924357, 0.6075643]), components=[array([1., 0., 0.]), array([1. , 0.99, 0.82])], indices=[0, 15], cost=5.446293494705139e-06, sum_weights=0.8999999999999999)
# ColorDecomposition(color=array([0.9, 0.6, 0.5]), weights=array([0.17826087, 0.91304348, 0.43478261]), components=[array([0., 0., 1.]), array([0.8, 0.5, 0.2]), array([0.39, 0.33, 0.32])], indices=[2, 6, 20], cost=0.0, sum_weights=1.526086956521739)
# ColorDecomposition(color=array([0.9, 0.6, 0.5]), weights=array([0.17826087, 0.91304348, 0.43478261]), components=[array([0., 0., 1.]), array([0.8, 0.5, 0.2]), array([0.39, 0.33, 0.32])], indices=[2, 6, 20], cost=0.0, sum_weights=1.526086956521739)
# ColorDecomposition(color=array([0.9, 0.6, 0.5]), weights=array([0.4, 0.1, 0.5]), components=[array([1., 0., 0.]), array([0., 1., 0.]), array([1., 1., 1.])], indices=[0, 1, 5], cost=2.6377536518327582e-30, sum_weights=0.9999999999999989)
# ColorDecomposition(color=array([0.9, 0.6, 0.5]), weights=array([0.4, 0.1, 0.5]), components=[array([1., 0., 0.]), array([0., 1., 0.]), array([1., 1., 1.])], indices=[0, 1, 5], cost=3.697785493223493e-32, sum_weights=0.9999999999999999)
# ColorDecomposition(color=array([0.9, 0.6, 0.5]), weights=array([0.4, 0.1, 0.5]), components=[array([1., 0., 0.]), array([0., 1., 0.]), array([1., 1., 1.])], indices=[0, 1, 5], cost=1.355854680848614e-31, sum_weights=1.0)
# [0.52, 0.5, 0.5]
# ColorDecomposition(color=array([0.50666667, 0.50666667, 0.50666667]), weights=array([0.50666667]), components=[array([1., 1., 1.])], indices=[5], cost=0.0002666666666666671, sum_weights=0.5066666666666667)
# ColorDecomposition(color=array([0.52, 0.5 , 0.5 ]), weights=array([0.52, 0.5 ]), components=[array([1., 0., 0.]), array([0., 1., 1.])], indices=[0, 16], cost=2.465190328815662e-32, sum_weights=1.02)
# ColorDecomposition(color=array([0.52, 0.5 , 0.5 ]), weights=array([0.2 , 0.2 , 0.508]), components=[array([0.76, 0.6 , 0.42]), array([0.57, 0.63, 0.81]), array([0.5, 0.5, 0.5])], indices=[8, 11, 19], cost=0.0, sum_weights=0.9079999999999999)
# ColorDecomposition(color=array([0.52, 0.5 , 0.5 ]), weights=array([0.2 , 0.2 , 0.508]), components=[array([0.76, 0.6 , 0.42]), array([0.57, 0.63, 0.81]), array([0.5, 0.5, 0.5])], indices=[8, 11, 19], cost=0.0, sum_weights=0.9079999999999999)
# ColorDecomposition(color=array([0.52, 0.5 , 0.5 ]), weights=array([0.02, 0.48, 0.5 ]), components=[array([1., 0., 0.]), array([0., 0., 0.]), array([1., 1., 1.])], indices=[0, 3, 5], cost=2.0954117794933126e-31, sum_weights=0.9999999999999996)
# ColorDecomposition(color=array([0.52, 0.5 , 0.5 ]), weights=array([0.02, 1. ]), components=[array([1., 0., 0.]), array([0.5, 0.5, 0.5])], indices=[0, 19], cost=0.0, sum_weights=1.02)
# ColorDecomposition(color=array([0.52, 0.5 , 0.5 ]), weights=array([0.02, 0.02, 0.96]), components=[array([1., 0., 0.]), array([1., 1., 1.]), array([0.5, 0.5, 0.5])], indices=[0, 5, 19], cost=9.860761315262648e-32, sum_weights=1.0)
# [0.5, 0.5, 0.5]
# ColorDecomposition(color=array([0.5, 0.5, 0.5]), weights=array([1.]), components=[array([0.5, 0.5, 0.5])], indices=[19], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([0.5, 0.5, 0.5]), weights=array([1.]), components=[array([0.5, 0.5, 0.5])], indices=[19], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([0.5, 0.5, 0.5]), weights=array([1.]), components=[array([0.5, 0.5, 0.5])], indices=[19], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([0.5, 0.5, 0.5]), weights=array([1.]), components=[array([0.5, 0.5, 0.5])], indices=[19], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([0.5, 0.5, 0.5]), weights=array([1.]), components=[array([0.5, 0.5, 0.5])], indices=[19], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([0.5, 0.5, 0.5]), weights=array([1.]), components=[array([0.5, 0.5, 0.5])], indices=[19], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([0.5, 0.5, 0.5]), weights=array([1.]), components=[array([0.5, 0.5, 0.5])], indices=[19], cost=0.0, sum_weights=1.0)
# [0, 0, 0]
# ColorDecomposition(color=array([0., 0., 0.]), weights=array([1.]), components=[array([0., 0., 0.])], indices=[3], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([0., 0., 0.]), weights=array([1.]), components=[array([0., 0., 0.])], indices=[3], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([0., 0., 0.]), weights=array([1.]), components=[array([0., 0., 0.])], indices=[3], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([0., 0., 0.]), weights=array([1.]), components=[array([0., 0., 0.])], indices=[3], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([0., 0., 0.]), weights=array([1.]), components=[array([0., 0., 0.])], indices=[3], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([0., 0., 0.]), weights=array([1.]), components=[array([0., 0., 0.])], indices=[3], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([0., 0., 0.]), weights=array([1.]), components=[array([0., 0., 0.])], indices=[3], cost=0.0, sum_weights=1.0)
# [1, 1, 1]
# ColorDecomposition(color=array([1., 1., 1.]), weights=array([1.]), components=[array([1., 1., 1.])], indices=[5], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([1., 1., 1.]), weights=array([1.]), components=[array([1., 1., 1.])], indices=[5], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([1., 1., 1.]), weights=array([0.1610306 , 0.96618357, 0.28692724]), components=[array([0.21, 0.27, 0.31]), array([1. , 0.99, 0.82]), array([0. , 0. , 0.55])], indices=[14, 15, 17], cost=0.0, sum_weights=1.4141414141414144)
# ColorDecomposition(color=array([1., 1., 1.]), weights=array([0.1610306 , 0.96618357, 0.28692724]), components=[array([0.21, 0.27, 0.31]), array([1. , 0.99, 0.82]), array([0. , 0. , 0.55])], indices=[14, 15, 17], cost=0.0, sum_weights=1.4141414141414144)
# ColorDecomposition(color=array([1., 1., 1.]), weights=array([1.]), components=[array([1., 1., 1.])], indices=[5], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([1., 1., 1.]), weights=array([1.]), components=[array([1., 1., 1.])], indices=[5], cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([1., 1., 1.]), weights=array([1.]), components=[array([1., 1., 1.])], indices=[5], cost=0.0, sum_weights=1.0)
Using Brute Force and Bounded Minimization
This is essentially the same as above, except that now we use a more sophisticated optimization method than simple unbounded least squares.
This provides us with weights that are already bounded, so there is no need for the additional code to handle that case, and, most importantly, candidate solutions are discarded solely on cost.
The approach would read:
import scipy.optimize


def _err(x, A, b):
    return b - A @ x


def cost(x, A, b):
    err = _err(x, A, b)
    return np.sum(err * err)


def decompose_bf_min(
    color: Sequence,
    colors: Sequence,
    max_nums: int = 3,
    min_nums: int = 1,
    min_weights: float = 0.0,
    max_weights: float = 1.0,
    normalize: bool = False,
) -> Optional[ColorDecomposition]:
    color = np.array(color)
    colors = np.array(colors)
    num_colors, num_channels = colors.shape
    # augment color/colors to include norm in cost
    if normalize:
        colors = np.concatenate(
            [colors, np.ones(num_colors, dtype=colors.dtype)[:, None]],
            axis=1,
        )
        color = np.concatenate([color, np.ones(1, dtype=colors.dtype)])
    # brute-force search
    best_indices = None
    best_weights = np.zeros(1)
    best_cost = np.inf
    for n in range(min_nums, max_nums + 1):
        for indices in itertools.combinations(range(num_colors), n):
            weights = np.full(n, 1 / n)
            if not np.allclose(color, 0):
                res = scipy.optimize.minimize(
                    cost,
                    weights,
                    (colors[indices, :].T, color),
                    bounds=[(min_weights, max_weights) for _ in range(n)],
                )
                weights = res.x
            curr_cost = cost(weights, colors[indices, :].T, color)
            if curr_cost <= best_cost:
                best_indices = indices
                best_weights = weights
                best_cost = curr_cost
    if best_indices is not None:
        return ColorDecomposition(
            color=(colors[best_indices, :].T @ best_weights)[:num_channels],
            weights=best_weights,
            components=[c for c in colors[best_indices, :num_channels]],
            indices=best_indices,
            cost=best_cost,
            sum_weights=np.sum(best_weights),
        )
    else:
        return None
which works as follows:
some_colors = [[0.9, 0.6, 0.5], [0.52, 0.5, 0.5], [0.5, 0.5, 0.5], [0, 0, 0], [1, 1, 1]]
# some_colors = [[0., 0., 0.]]
for color in some_colors:
    print(color)
    print(decompose_bf_min(color, colors))
    print(decompose_bf_min(color, colors, normalize=True))
# [0.9, 0.6, 0.5]
# ColorDecomposition(color=array([0.9, 0.6, 0.5]), weights=array([0.42982455, 0.2631579 , 0.70701754]), components=[array([0.8, 0.5, 0.2]), array([0.77, 0.12, 0.23]), array([0.5, 0.5, 0.5])], indices=(6, 10, 19), cost=2.3673037349051385e-17, sum_weights=1.399999995602849)
# ColorDecomposition(color=array([0.89999998, 0.60000001, 0.49999999]), weights=array([0.4 , 0.10000003, 0.49999999]), components=[array([1., 0., 0.]), array([0., 1., 0.]), array([1., 1., 1.])], indices=(0, 1, 5), cost=6.957464274781682e-16, sum_weights=1.0000000074212045)
# [0.52, 0.5, 0.5]
# ColorDecomposition(color=array([0.52, 0.5 , 0.5 ]), weights=array([0.02, 0. , 1. ]), components=[array([1., 0., 0.]), array([1. , 0.99, 0.82]), array([0.5, 0.5, 0.5])], indices=(0, 15, 19), cost=2.1441410828292465e-17, sum_weights=1.019999995369513)
# ColorDecomposition(color=array([0.52000021, 0.50000018, 0.50000018]), weights=array([0.02000003, 0.02000077, 0.95999883]), components=[array([1., 0., 0.]), array([1., 1., 1.]), array([0.5, 0.5, 0.5])], indices=(0, 5, 19), cost=2.517455337509621e-13, sum_weights=0.9999996259509482)
# [0.5, 0.5, 0.5]
# ColorDecomposition(color=array([0.5, 0.5, 0.5]), weights=array([0., 0., 1.]), components=[array([0., 1., 1.]), array([1. , 0.01, 0.24]), array([0.5, 0.5, 0.5])], indices=(16, 18, 19), cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([0.5, 0.5, 0.5]), weights=array([0., 1., 0.]), components=[array([1. , 0.01, 0.24]), array([0.5, 0.5, 0.5]), array([0.39, 0.33, 0.32])], indices=(18, 19, 20), cost=0.0, sum_weights=1.0)
# [0, 0, 0]
# ColorDecomposition(color=array([0., 0., 0.]), weights=array([1.]), components=[array([0., 0., 0.])], indices=(3,), cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([0., 0., 0.]), weights=array([1., 0., 0.]), components=[array([0., 0., 0.]), array([0. , 0. , 0.55]), array([1. , 0.01, 0.24])], indices=(3, 17, 18), cost=0.0, sum_weights=1.0)
# [1, 1, 1]
# ColorDecomposition(color=array([1., 1., 1.]), weights=array([1., 0., 0.]), components=[array([1., 1., 1.]), array([0., 1., 1.]), array([1. , 0.01, 0.24])], indices=(5, 16, 18), cost=0.0, sum_weights=1.0)
# ColorDecomposition(color=array([1., 1., 1.]), weights=array([1., 0., 0.]), components=[array([1., 1., 1.]), array([1. , 0.01, 0.24]), array([0.5, 0.5, 0.5])], indices=(5, 18, 19), cost=0.0, sum_weights=1.0)

You can make a system of equations to come up with a weighting vector which will tell you the combination of the three colors which exactly equals the input. The form of this is Ax = b, where A is the color matrix, x are the unknown variables to solve, and b is the color target. You have already calculated the 'A' in this situation; it just needs to be transposed. Of course, you also have your target color. However, the solution is not guaranteed to lie in the [0, 1] range. If, for instance, you can only vary the intensity (from 0 to 1, or equivalently 0% to 100%) of the three colors to achieve this input color, then this approach is not sufficient. If you need each of the weighting values to be between 0 and 1, then you can solve a linear program in which you specify the constraints 0 <= w <= 1 on the weights. That seems a little complicated for this, but it can be done if that's an interest.
Edit: I added the linear program to solve the problem. Linear programs are used to solve complex optimization problems where there are constraints that are placed on the variables in the system of equations. They are very powerful and can accomplish a lot. Unfortunately, it raises the complexity of the code quite a bit. Also, just to let you know, there is no guarantee that there will be a solution in which all the variables are in the set [0,1]. I believe in this particular example, it is not possible, but it does get very close.
import numpy as np
# target vector
input_color = np.array([.52, .5, .5])
input_color = np.reshape(input_color, (1, len(input_color))).T
# create color matrix with 3 chosen colors
color_1 = np.array([.5, .5, .5])
color_2 = np.array([.76, .6, .42])
color_3 = np.array([.8, .5, .2])
C = np.vstack([color_1, color_2, color_3]).T
# use linear algebra to solve for variables
weights = np.matmul(np.linalg.pinv(C),input_color)
# show that the correct values for each color were calculated
print(weights[0]*color_1 + weights[1]*color_2 + weights[2]*color_3)
from scipy.optimize import linprog
color_1 = np.array([.5, .5, .5])
color_2 = np.array([.76, .6, .42])
color_3 = np.array([.8, .5, .2])
# make variables greater than zero
ineq_1 = np.array([-1, 0, 0])
ineq_2 = np.array([0, -1, 0])
ineq_3 = np.array([0, 0, -1])
# make variables less than or equal to one
ineq_4 = np.array([1, 0, 0])
ineq_5 = np.array([0, 1, 0])
ineq_6 = np.array([0, 0, 1])
C = np.vstack([color_1, color_2, color_3]).T
C = np.vstack([C, ineq_1, ineq_2, ineq_3, ineq_4, ineq_5, ineq_6])
A = C
input_color = np.array([.52, .5, .5])
b = np.concatenate((input_color, np.array([0, 0, 0, 1, 1, 1])),axis=0)
b = np.reshape(b, (1, len(b))).T
# scipy minimizes, so maximize by multiplying by -1
c = -1*np.array([1, 1, 1])
# Visually, what we have right now is
# maximize f = x1 + x2 + x3
# color_1_red*x1 + color_2_red*x2 + color_3_red*x3 <= input_color_red
# color_1_gre*x1 + color_2_gre*x2 + color_3_gre*x3 <= input_color_gre
# color_1_blu*x1 + color_2_blu*x2 + color_3_blu*x3 <= input_color_blu
# x1 >= 0
# x2 >= 0
# x3 >= 0
# x1 <= 1
# x2 <= 1
# x3 <= 1
# As you'll notice, we have the original system of equations in our constraint
# on the system. However, we have added the constraints that the variables
# must be in the set [0,1]. We maximize the variables because linear programs
# are made simpler when the system of equations are less than or equal to.
# calculate optimal variables with constraints
res = linprog(c, A_ub=A, b_ub=b)
print(res.x)
print(res.x[0]*color_1 + res.x[1]*color_2 + res.x[2]*color_3)

Related

PyTorch: Differentiable operations to go from coordinate tensor to grid tensor

I have a tensor that looks like
coords = torch.Tensor([[0, 0, 1, 2],
                       [0, 2, 2, 2]])
The first row is the x-coordinates of objects on a grid and the second row is the corresponding y-coordinates.
I need a differentiable way (i.e. gradients can flow) to go from this tensor to the corresponding "grid" tensor, where a 1 represents the presence of an object in that location (row index, column index) and 0 represents no object:
grid = torch.Tensor([[1, 0, 1],
                     [0, 0, 1],
                     [0, 0, 1]])
In general, coords can be large (the grid size is 300x300). If coords was a sparse tensor I could simply call to_dense on it, but for various reasons specific to my application I cannot store coords as sparse. Additionally, I cannot create a new sparse tensor from coords and call to_dense on it because creating a new tensor is not differentiable.
Any help is appreciated!
I'm not sure what you mean by 'differentiable', but here's a simple way to do it using advanced indexing.
coords = coords.long()
grid = torch.zeros(3, 3)  # allocate the grid first (3x3 in this example)
grid[coords[0], coords[1]] = 1
grid
tensor([[1., 0., 1.],
        [0., 0., 1.],
        [0., 0., 1.]])
I don't think PyTorch has detailed documentation on this, but NumPy does (the semantics are probably very similar for torch).
This is also possible:
coords = coords.long()
grid[coords[0], coords[1]] = torch.Tensor([1, 2, 3, 4])
tensor([[1., 0., 2.],
        [0., 0., 3.],
        [0., 0., 4.]])
Say
coords = [[0, 0, 1, 2],
          [0, 2, 2, 2]]
Then:
torch.stack([torch.tensor(x) for x in coords])
(torch.stack expects tensors, so each inner list is converted first.)

How to correctly guess the initial points in LogLog plot linear regression?

I have 5 sets of data represented by 5 distinct colored errorbars in the following code (caps are not shown). The errorbar plot is shown on a logarithmic scale on both axes. Using curve_fit, I am trying to find the best linear regression passing through these errorbars. However, the power-law equation I have defined to fit does not easily find the best-fit slope of the 5 lines. My expectation is that all 5 colored lines should be straight with negative slopes. I had a hard time figuring out which starting point p0 I should specify in the curve-fitting process. Even with my initial hard-to-guess values, I still don't get all straight lines, and some of them are too far off from my points. What is the issue here?
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
x_mean = [2.81838293e+20, 5.62341325e+20, 1.12201845e+21, 2.23872114e+21, 4.46683592e+21, 8.91250938e+21, 1.77827941e+22]
mean_1 = [52., 21.33333333, 4., 1., 0., 0., 0.]
mean_2 = [57., 16.66666667, 5.66666667, 2.33333333, 0.66666667, 0., 0.33333333]
mean_3 = [67.33333333, 20., 8.66666667, 3., 0.66666667, 1., 0.33333333]
mean_4 = [79.66666667, 25., 8.33333333, 3., 1., 0., 0.]
mean_5 = [54.66666667, 16.66666667, 8.33333333, 2., 2., 1., 0.]
error_1 = [4.163332, 2.66666667, 1.15470054, 0.57735027, 0., 0., 0.]
error_2 = [4.35889894, 2.3570226, 1.37436854, 0.8819171, 0.47140452, 0., 0.33333333]
error_3 = [4.7375568, 2.5819889, 1.69967317, 1., 0.47140452, 0.57735027, 0.33333333]
error_4 = [5.15320828, 2.88675135, 1.66666667, 1., 0.57735027, 0., 0.]
error_5 = [4.26874949, 2.3570226, 1.66666667, 0.81649658, 0.81649658, 0.57735027, 0.]
newX = np.logspace(20, 22.3)
def myExpFunc(x, a, b):
    return a*np.power(x, b)
popt_1, pcov_1 = curve_fit(myExpFunc, x_mean, mean_1, sigma=error_1, absolute_sigma=True, p0=(4e31,-1.5))
popt_2, pcov_2 = curve_fit(myExpFunc, x_mean, mean_2, sigma=error_2, absolute_sigma=True, p0=(4e31,-1.5))
popt_3, pcov_3 = curve_fit(myExpFunc, x_mean, mean_3, sigma=error_3, absolute_sigma=True, p0=(4e31,-1.5))
popt_4, pcov_4 = curve_fit(myExpFunc, x_mean, mean_4, sigma=error_4, absolute_sigma=True, p0=(4e31,-1.5))
popt_5, pcov_5 = curve_fit(myExpFunc, x_mean, mean_5, sigma=error_5, absolute_sigma=True, p0=(4e31,-1.5))
fig, ax1 = plt.subplots(figsize=(3,5))
ax1.errorbar(x_mean, mean_1, yerr=error_1, ecolor = 'magenta', fmt= 'mo', ms=0, elinewidth = 1, capsize = 0, capthick=0)
ax1.errorbar(x_mean, mean_2, yerr=error_2, ecolor = 'red', fmt= 'ro', ms=0, elinewidth = 1, capsize = 0, capthick=0)
ax1.errorbar(x_mean, mean_3, yerr=error_3, ecolor = 'orange', fmt= 'yo', ms=0, elinewidth = 1, capsize = 0, capthick=0)
ax1.errorbar(x_mean, mean_4, yerr=error_4, ecolor = 'green', fmt= 'go', ms=0, elinewidth = 1, capsize = 0, capthick=0)
ax1.errorbar(x_mean, mean_5, yerr=error_5, ecolor = 'blue', fmt= 'bo', ms=0, elinewidth = 1, capsize = 0, capthick=0)
ax1.plot(newX, myExpFunc(newX, *popt_1), 'm-', label='{:.2f} \u00B1 {:.2f}'.format(popt_1[1], pcov_1[1,1]**0.5))
ax1.plot(newX, myExpFunc(newX, *popt_2), 'r-', label='{:.2f} \u00B1 {:.2f}'.format(popt_2[1], pcov_2[1,1]**0.5))
ax1.plot(newX, myExpFunc(newX, *popt_3), 'y-', label='{:.2f} \u00B1 {:.2f}'.format(popt_3[1], pcov_3[1,1]**0.5))
ax1.plot(newX, myExpFunc(newX, *popt_4), 'g-', label='{:.2f} \u00B1 {:.2f}'.format(popt_4[1], pcov_4[1,1]**0.5))
ax1.plot(newX, myExpFunc(newX, *popt_5), 'b-', label='{:.2f} \u00B1 {:.2f}'.format(popt_5[1], pcov_5[1,1]**0.5))
ax1.legend(handlelength=0, loc='upper right', ncol=1, fontsize=10)
ax1.set_xlim([2e20, 3e22])
ax1.set_ylim([2e-1, 1e2])
ax1.set_xscale("log")
ax1.set_yscale("log")
plt.show()
Your numbers for X are far too large. Maybe you can try taking the log of both sides and fit that? Such as:
log Y = log(a) + b*log(X)
You won't even need curve_fit at that point; it's a standard linear regression.
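For instance, a minimal sketch of that regression with np.polyfit, using x_mean and mean_1 from the question (zero counts are masked out, since log10(0) is undefined):
import numpy as np

logx = np.log10(x_mean)
logy = np.log10(mean_1)
mask = np.isfinite(logy)  # drop the zero-count bins
b, log_a = np.polyfit(logx[mask], logy[mask], 1)
print(b, 10.0**log_a)  # slope (power-law index) and amplitude a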
EDIT
Please see my rough and not very well checked implementation (NOTE: I only have Python 2, so adjust to fit):
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as optimize
x_mean = [2.81838293e+20, 5.62341325e+20, 1.12201845e+21, 2.23872114e+21, 4.46683592e+21, 8.91250938e+21, 1.77827941e+22]
mean_1 = [52., 21.33333333, 4., 1., 0., 0., 0.]
mean_2 = [57., 16.66666667, 5.66666667, 2.33333333, 0.66666667, 0., 0.33333333]
mean_3 = [67.33333333, 20., 8.66666667, 3., 0.66666667, 1., 0.33333333]
mean_4 = [79.66666667, 25., 8.33333333, 3., 1., 0., 0.]
mean_5 = [54.66666667, 16.66666667, 8.33333333, 2., 2., 1., 0.]
error_1 = [4.163332, 2.66666667, 1.15470054, 0.57735027, 0., 0., 0.]
error_2 = [4.35889894, 2.3570226, 1.37436854, 0.8819171, 0.47140452, 0., 0.33333333]
error_3 = [4.7375568, 2.5819889, 1.69967317, 1., 0.47140452, 0.57735027, 0.33333333]
error_4 = [5.15320828, 2.88675135, 1.66666667, 1., 0.57735027, 0., 0.]
error_5 = [4.26874949, 2.3570226, 1.66666667, 0.81649658, 0.81649658, 0.57735027, 0.]
def powerlaw(x, amp, index):
    return amp * np.power(x, index)

# define our (line) fitting function
def fitfunc(p, x):
    return p[0] + p[1] * x

def errfunc(p, x, y, err):
    out = (y - fitfunc(p, x)) / err
    out[~np.isfinite(out)] = 0.0
    return out
pinit = [1.0, -1.0]
fig = plt.figure()
ax1 = fig.add_subplot(2, 1, 1)
ax2 = fig.add_subplot(2, 1, 2)
for indx in range(1, 6):
    mean = eval('mean_%d' % indx)
    error = eval('error_%d' % indx)
    logx = np.log10(x_mean)
    logy = np.log10(mean)
    logy[~np.isfinite(logy)] = 0.0
    logyerr = np.array(error) / np.array(mean)
    logyerr[~np.isfinite(logyerr)] = 0.0
    out = optimize.leastsq(errfunc, pinit, args=(logx, logy, logyerr), full_output=1)
    pfinal = out[0]
    covar = out[1]
    index = pfinal[1]
    amp = 10.0**pfinal[0]
    # p[0] is log(amp) and p[1] is the index, so take the matching diagonals
    indexErr = np.sqrt(covar[1][1])
    ampErr = np.sqrt(covar[0][0]) * amp
    ##########
    # Plotting data
    ##########
    ax1.plot(x_mean, powerlaw(x_mean, amp, index), label=u'{:.2f} \u00B1 {:.2f}'.format(pfinal[1], covar[1, 1]**0.5))  # Fit
    ax1.errorbar(x_mean, mean, yerr=error, fmt='k.', label='__no_legend__')  # Data
    ax1.set_title('Best Fit Power Law', fontsize=18, fontweight='bold')
    ax1.set_xlabel('X', fontsize=14, fontweight='bold')
    ax1.set_ylabel('Y', fontsize=14, fontweight='bold')
    ax1.grid()
    ax2.loglog(x_mean, powerlaw(x_mean, amp, index), label=u'{:.2f} \u00B1 {:.2f}'.format(pfinal[1], covar[1, 1]**0.5))
    ax2.errorbar(x_mean, mean, yerr=error, fmt='k.', label='__no_legend__')  # Data
    ax2.set_xlabel('X (log scale)', fontsize=14, fontweight='bold')
    ax2.set_ylabel('Y (log scale)', fontsize=14, fontweight='bold')
    ax2.grid(b=True, which='major', linestyle='--', color='darkgrey')
    ax2.grid(b=True, which='minor', linestyle=':', color='grey')
ax1.legend()
ax2.legend()
plt.show()

How to convert a label list under the multi-label classification context into one-hot encoding with pytorch?

I have a list of labels for one batch of data, with multiple labels for every sample. How do I convert it into a one-hot encoded torch.Tensor?
For example, with batch_size=5 and class_num=6,
label =[
[1,2,3],
[4,6],
[1],
[1,4,5],
[4]
]
how to make it into one-hot encoding in pytorch?
label_tensor=tensor([
[1,1,1,0,0,0],
[0,0,0,1,0,1],
[1,0,0,0,0,0],
[1,0,0,1,1,0],
[0,0,0,1,0,0]
])
If the batch size can be derived from len(labels):
import torch

def to_onehot(labels, n_categories, dtype=torch.float32):
    batch_size = len(labels)
    one_hot_labels = torch.zeros(size=(batch_size, n_categories), dtype=dtype)
    for i, label in enumerate(labels):
        # Subtract 1 from each LongTensor because your
        # indexing starts at 1 and tensor indexing starts at 0
        label = torch.LongTensor(label) - 1
        one_hot_labels[i] = one_hot_labels[i].scatter_(dim=0, index=label, value=1.)
    return one_hot_labels
and you have 6 categories and want the output to be a tensor of integers:
to_onehot(labels, n_categories=6, dtype=torch.int64)
tensor([[1, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 1],
[1, 0, 0, 0, 0, 0],
[1, 0, 0, 1, 1, 0],
[0, 0, 0, 1, 0, 0]])
I would stick to torch.float32 in case you want to use label smoothing, mix-up or something along those lines later.
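As an alternative sketch (not part of the original answer), the same multi-hot tensor can be built with torch.nn.functional.one_hot, again assuming the question's 1-based integer labels:
import torch
import torch.nn.functional as F

labels = [[1, 2, 3], [4, 6], [1], [1, 4, 5], [4]]
# one_hot expects 0-based class indices, hence the -1; summing the
# per-label one-hot rows of each sample yields its multi-hot vector
rows = [F.one_hot(torch.tensor(row) - 1, num_classes=6).sum(dim=0) for row in labels]
label_tensor = torch.stack(rows)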
To handle any situation (including string labels) I've extended @karniol's answer:
def multihot_encoder(labels, dtype=torch.float32):
    """Convert a list of label lists into a 2-D multihot Tensor."""
    label_set = set()
    for label_list in labels:
        label_set = label_set.union(set(label_list))
    label_set = sorted(label_set)
    multihot_vectors = []
    for label_list in labels:
        multihot_vectors.append([1 if x in label_list else 0 for x in label_set])
    # To keep track of which columns are which, set dtype to None and...
    # import pandas as pd
    if dtype is None:
        return pd.DataFrame(multihot_vectors, columns=label_set)
    return torch.Tensor(multihot_vectors).to(dtype)
Your use case:
label_lists = [[1,2,3], [4,6], [1], [1,4,5], [4]]
>>> multihot_encoder(label_lists)
tensor([[1., 1., 1., 0., 0., 0.],
[0., 0., 0., 1., 0., 1.],
[1., 0., 0., 0., 0., 0.],
[1., 0., 0., 1., 1., 0.],
[0., 0., 0., 1., 0., 0.]])
If you want to keep track of your labels (feature names) before converting your dataset to a Tensor, just set dtype to None:
label_lists = [
['happy', 'kind'], ['sad', 'mean'],
['loud', 'happy'], ['quiet', 'kind']
]
multihot_encoder(label_lists, dtype=None)
happy kind loud mean quiet sad
0 1 1 0 0 0 0
1 0 0 0 1 0 1
2 1 0 1 0 0 0
3 0 1 0 0 1 0

Find Up-Vector for LookAt Matrix (2 rotations)

I'm trying to establish a lookAt matrix to look at the point {0,0,0} from an eye at {5,5,10}. I've successfully implemented the LookAt matrix for an eye at {5,0,10}, but the second angle (y direction) throws off my 'up' vector so it isn't {0,1,0} anymore. Can someone help me determine what my 'up' vector should be in terms of my eye matrix? I'd like to keep the x axis parallel with the horizon.
I'm using Mathematica, but am going to implement in Python...
Below is the code for using an eye at {5,0,10}, but I'd like to change the eye to {5,5,10} and figure out the correct 'up' vector:
eye = {5, 5, 10};
l = {0, 0, 0};
Mt = IdentityMatrix[4];
F = l - eye;
Mt[[1 ;; 3, 4]] = -eye;
>>> Mt = {{1, 0, 0, -5}, {0, 1, 0, 0}, {0, 0, 1, -10}, {0, 0, 0, 1}}
forward = Normalize[F];
up = Normalize[{0, 1, 0}];
left = Cross[up, forward];
Mr = IdentityMatrix[4];
Mr[[1, 1 ;; 3]] = left;
Mr[[2, 1 ;; 3]] = up;
Mr[[3, 1 ;; 3]] = forward;
>>> Mr = {{-0.894427, 0., 0.447214, 0.}, {0., 1., 0., 0.}, {-0.447214, 0., -0.894427, 0.}, {0., 0., 0., 1.}}
>>> Mr.Mt = {{-0.894427, 0., 0.447214, 0.}, {0., 1., 0., 0.}, {-0.447214, 0., -0.894427, 11.1803}, {0., 0., 0., 1.}}
You need to re-orthogonalize your up-vector:
up = Cross[forward, left];
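Since the question mentions a Python implementation, here is a minimal NumPy sketch of the same construction (same row convention as Mr and Mt above; the function name is illustrative):
import numpy as np

def look_at(eye, target, world_up=np.array([0.0, 1.0, 0.0])):
    forward = target - eye
    forward = forward / np.linalg.norm(forward)
    left = np.cross(world_up, forward)
    left = left / np.linalg.norm(left)
    up = np.cross(forward, left)  # re-orthogonalized up-vector
    M = np.eye(4)
    M[0, :3], M[1, :3], M[2, :3] = left, up, forward
    M[:3, 3] = M[:3, :3] @ -eye   # rotate, then translate (Mr.Mt)
    return M

look_at(np.array([5.0, 5.0, 10.0]), np.zeros(3))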

Thresholding in Theano

Is there a way to threshold the values in a Theano tensor? For instance, if v = t.vector(), I would like to create another tensor w which contains the same values as v, except that the ones that exceed a certain threshold T are replaced by T itself:
v = [1, 2, 3, 100, 200, 300]
T = 100
w = [1, 2, 3, 100, 100, 100]
More generally, is there a standard framework to create your own operations on tensors?
Here is code that does that, using the clip function:
import theano
v = theano.tensor.vector()
f = theano.function([v], theano.tensor.clip(v, 0, 100))
f([1, 2, 3, 100, 200, 300])
# array([ 1., 2., 3., 100., 100., 100.])
If you don't want a lower bound, you can use a switch:
f = theano.function([v], theano.tensor.switch(v<100, v, 100))
f([1, 2, 3, 100, 200, 300])
# array([ 1., 2., 3., 100., 100., 100.])
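If only the upper threshold matters, the elementwise minimum is another option along the same lines (a small sketch, reusing v from above):
f = theano.function([v], theano.tensor.minimum(v, 100))
f([1, 2, 3, 100, 200, 300])
# array([ 1., 2., 3., 100., 100., 100.])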
