With sklearn, I am trying to model a pickup and dropoff vehicle routing problem. If you can recommend a suitable classifier, it would be appreciated. For simplicity, there is one vehicle and there are 5 customers. The training data has 20 features and 10 outputs.
Features are the x-y coordinates of the 5 customers. Each customer has a pickup location and a dropoff location.
c1p_x, c1p_y,c2p_x, c2p_y,c3p_x, c3p_y,c4p_x, c4p_y,c5p_x, c5p_y,
c1d_x, c1d_y,c2d_x, c2d_y,c3d_x, c3d_y,c4d_x, c4d_y,c5d_x, c5d_y,
c1p_x, c1p_y: customer 1 pickup x-y coordinates.
c1d_x, c1d_y: customer 1 dropoff x-y coordinates.
For example,
123,106,332,418,106,477,178,363,381,349,54,214,297,34,5,122,3,441,455,322
Outputs are the optimal visiting sequence. For example, 5,10,2,7,1,6,4,9,3,8
Customer 5 (pkup) => 10 (drop) => 2 (pkup) => 7 (drop) ... => 8 (drop)
Note that each pickup is immediately followed by its dropoff.
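To make the encoding concrete, here is a minimal sketch of how a stop number maps back to a coordinate pair and a label. It assumes stops 1-5 are the pickups and 6-10 the dropoffs, in the column order listed above; `stop_coords` and `route_length` are hypothetical helper names, and the length ignores any start location since none is given.

```python
import math

def stop_coords(features, stop):
    """Return (x, y) for stop 1..10 (assumed: 1-5 pickups, 6-10 dropoffs)."""
    i = 2 * (stop - 1)
    return features[i], features[i + 1]

def route_length(features, sequence):
    """Sum of straight-line distances between consecutive stops."""
    pts = [stop_coords(features, s) for s in sequence]
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

features = [123, 106, 332, 418, 106, 477, 178, 363, 381, 349,
            54, 214, 297, 34, 5, 122, 3, 441, 455, 322]
sequence = [5, 10, 2, 7, 1, 6, 4, 9, 3, 8]

# Turn stop numbers into readable labels such as c5p (pickup) or c5d (dropoff).
labels = [f"c{(s - 1) % 5 + 1}{'p' if s <= 5 else 'd'}" for s in sequence]
print(labels)
print(round(route_length(features, sequence), 1))
```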
Here is the code I tried.
# First attempt: MLPClassifier
import numpy as np
import pandas as pd
from sklearn.neural_network import MLPClassifier
train = pd.read_csv('ML_DARP_train.txt',header=None,sep=',')
print (train.head())
x = train[range(0,20)]   # columns 0-19: the 20 coordinate features
y = train[range(20,30)]  # columns 20-29: the 10-stop visiting sequence
classifier = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(15,), random_state=1)
classifier.fit(x, y)
print(classifier.score(x, y))
test = pd.read_csv('ML_DARP_test.txt',header=None,sep=',')
test = test[range(0,20)]  # keep all 20 feature columns
print (classifier.predict(test))
# Second attempt: RandomForestClassifier wrapped in MultiOutputClassifier
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier
import pandas as pd
train = pd.read_csv('ML_DARP_train.txt',header=None,sep=',')
print (train.head())
x = train[range(0,20)]  # columns 0-19: the 20 coordinate features
y = train[range(20,30)]
print (y)
forest = RandomForestClassifier(n_estimators=100, random_state=0)
classifier = MultiOutputClassifier(forest, n_jobs=-1)
classifier.fit(x, y)
print(classifier.score(x, y))
test = pd.read_csv('ML_DARP_test.txt',header=None,sep=',')
test = test[range(0,20)]  # keep all 20 feature columns
print (classifier.predict(test))
Here is the training data.
123,106,332,418,106,477,178,363,381,349,54,214,297,34,5,122,3,441,455,322,5,10,2,7,1,6,4,9,3,8
154,129,466,95,135,191,243,13,289,227,300,40,171,286,219,403,232,113,378,428,5,10,2,7,1,6,4,9,3,8
215,182,163,321,259,500,434,304,355,276,77,414,93,83,42,292,101,459,488,237,5,10,4,9,3,8,2,7,1,6
277,220,313,29,304,229,500,454,263,154,339,255,484,351,287,87,330,147,411,343,1,6,3,8,2,7,4,9,5,10
308,258,464,223,349,460,64,120,188,62,100,96,374,118,16,368,73,352,365,480,2,7,1,6,5,10,3,8,4,9
369,296,97,385,363,174,161,317,128,472,346,423,217,338,246,163,349,87,335,132,2,7,4,9,1,6,5,10,3,8
400,318,263,94,471,467,321,45,146,475,107,264,139,136,53,36,155,370,382,380,3,8,2,7,4,9,5,10,1,6
477,387,461,350,62,244,417,242,102,399,401,137,76,451,330,364,431,90,368,47,3,8,1,6,4,9,2,7,5,10
38,441,95,12,45,412,452,361,496,276,162,479,420,155,12,112,128,263,290,138,4,9,1,6,3,8,2,7,5,10
69,447,245,205,106,157,79,89,467,216,393,289,311,422,273,440,435,30,291,323,2,7,4,9,3,8,1,6,5,10
115,0,427,430,214,451,207,302,439,172,185,178,232,220,64,282,210,266,292,22,2,7,5,10,1,6,3,8,4,9
192,53,92,123,259,180,273,468,363,81,447,19,122,488,310,77,454,471,246,159,3,8,1,6,5,10,2,7,4,9
223,91,227,317,304,411,385,180,319,5,208,361,498,239,54,389,245,222,231,328,5,10,2,7,4,9,1,6,3,8
269,113,424,57,396,188,12,378,322,493,470,218,435,52,331,231,20,474,263,59,4,9,5,10,1,6,2,7,3,8
315,151,42,204,410,387,78,43,215,355,215,28,278,273,44,11,264,178,170,149,5,10,2,7,3,8,4,9,1,6
393,236,239,444,487,148,191,240,202,326,23,417,200,71,321,338,39,414,203,365,4,9,2,7,3,8,1,6,5,10
454,274,390,153,62,410,303,453,173,266,286,259,106,354,96,165,331,165,203,48,4,9,2,7,3,8,5,10,1,6
15,327,86,378,154,187,447,181,160,237,62,131,27,152,389,23,137,448,220,264,3,8,4,9,2,7,1,6,5,10
61,365,237,71,184,417,43,379,131,178,324,474,403,388,133,334,413,167,205,417,5,10,3,8,1,6,4,9,2,7
123,418,403,280,261,178,124,59,56,86,101,331,309,170,394,145,172,404,175,70,3,8,2,7,5,10,4,9,1,6
169,441,36,458,275,378,190,194,466,465,332,141,167,406,108,426,385,76,97,176,3,8,2,7,1,6,5,10,4,9
215,494,249,213,398,186,365,470,500,483,124,29,120,236,431,315,238,391,161,439,3,8,5,10,4,9,1,6,2,7
246,0,337,345,396,370,399,88,377,329,355,324,449,441,113,48,420,32,68,28,2,7,3,8,1,6,5,10,4,9
339,100,49,84,489,131,496,286,317,253,163,213,370,238,390,375,195,268,37,181,5,10,2,7,3,8,4,9,1,6
385,122,215,294,80,424,139,29,320,240,425,70,292,36,181,233,17,50,70,413,2,7,4,9,1,6,5,10,3,8
416,144,366,2,125,154,236,211,291,180,170,396,182,304,427,44,277,286,86,112,5,10,3,8,2,7,4,9,1,6
477,198,15,180,170,384,348,424,231,105,448,253,41,39,171,356,68,22,56,265,1,6,4,9,2,7,5,10,3,8
38,251,181,390,231,130,414,58,171,13,209,95,448,322,417,151,281,211,10,387,2,7,5,10,1,6,3,8,4,9
69,258,347,98,276,360,495,239,111,439,455,437,354,89,161,447,40,447,480,55,3,8,1,6,4,9,5,10,2,7
131,327,28,323,384,153,169,500,130,426,248,309,275,388,469,321,379,229,27,286,1,6,5,10,2,7,4,9,3,8
161,334,116,454,351,305,172,102,492,272,463,88,87,76,120,37,60,387,419,361,1,6,5,10,3,8,4,9,2,7
238,403,313,194,428,82,300,331,479,227,255,462,9,375,412,396,367,154,420,45,2,7,1,6,4,9,5,10,3,8
285,456,10,419,35,375,460,74,498,230,47,350,447,189,219,254,189,437,484,308,2,7,5,10,4,9,1,6,3,8
346,494,144,112,95,120,71,287,469,170,294,176,322,441,480,81,480,188,469,476,3,8,2,7,4,9,5,10,1,6
393,47,310,322,156,367,168,453,378,63,71,33,227,223,240,393,208,377,423,113,5,10,1,6,3,8,4,9,2,7
470,100,22,77,264,159,312,197,380,34,364,406,180,52,47,266,30,159,440,329,4,9,2,7,5,10,1,6,3,8
15,138,157,239,294,358,362,332,289,429,109,248,23,273,246,14,243,348,393,466,5,10,4,9,3,8,2,7,1,6
61,176,323,449,355,119,490,59,261,384,371,89,430,39,22,357,49,99,394,134,2,7,4,9,1,6,3,8,5,10
92,198,427,110,353,303,39,210,201,293,117,400,274,260,220,121,278,304,348,272,1,6,4,9,5,10,3,8,2,7
185,267,154,367,476,111,183,439,172,249,410,288,226,89,27,496,84,70,365,488,5,10,4,9,1,6,3,8,2,7
231,321,305,59,36,357,264,119,112,157,187,145,117,357,288,291,345,291,319,124,4,9,1,6,5,10,2,7,3,8
262,327,440,238,66,71,345,270,36,66,417,456,477,92,17,86,72,480,273,262,1,6,2,7,4,9,3,8,5,10
323,381,89,431,96,287,411,436,477,476,179,298,367,359,231,367,317,200,242,415,1,6,3,8,2,7,5,10,4,9
369,418,286,171,219,94,85,195,10,494,456,170,304,173,54,256,154,499,321,192,1,6,2,7,3,8,4,9,5,10
416,456,437,365,249,325,166,377,452,403,217,12,179,425,299,51,415,218,275,330,3,8,2,7,4,9,5,10,1,6
477,9,86,42,294,55,248,58,360,296,479,354,53,176,28,347,158,423,214,452,3,8,2,7,4,9,1,6,5,10
7,31,237,251,355,301,360,255,332,236,240,179,460,459,289,158,434,158,214,151,2,7,3,8,4,9,5,10,1,6
84,84,418,476,447,78,473,468,319,207,17,52,381,241,65,0,225,426,231,335,5,10,4,9,2,7,3,8,1,6
146,154,83,169,477,277,22,102,180,37,295,410,256,493,279,265,438,83,107,395,4,9,5,10,2,7,1,6,3,8
177,176,234,363,21,7,134,299,183,24,40,236,131,244,24,76,213,334,155,125,4,9,2,7,1,6,3,8,5,10
238,214,416,87,144,332,309,74,201,27,318,108,68,58,347,466,66,148,202,372,1,6,2,7,4,9,5,10,3,8
316,298,128,344,236,108,438,287,173,468,126,498,5,372,154,340,357,400,188,40,4,9,1,6,5,10,3,8,2,7
346,305,215,475,219,276,441,391,50,330,341,292,334,76,337,57,38,41,126,162,3,8,1,6,4,9,5,10,2,7
408,358,397,199,296,37,68,103,37,285,118,133,239,359,97,384,330,309,127,346,3,8,5,10,1,6,4,9,2,7
439,381,47,377,357,284,180,316,494,225,380,491,114,110,358,211,120,44,112,30,1,6,3,8,5,10,2,7,4,9
15,450,228,117,449,60,309,28,465,165,156,348,51,425,150,69,412,311,97,199,3,8,4,9,5,10,2,7,1,6
77,2,363,295,463,260,358,163,358,43,434,205,411,160,348,318,124,469,35,305,5,10,1,6,3,8,2,7,4,9
123,40,59,19,23,21,487,392,345,500,211,62,317,428,124,160,416,236,36,4,5,10,3,8,1,6,2,7,4,9
169,62,194,197,83,267,83,72,317,455,441,389,207,194,385,472,175,472,37,189,2,7,1,6,5,10,4,9,3,8
231,131,376,438,160,28,164,254,241,348,234,261,129,493,145,298,435,176,492,326,3,8,5,10,2,7,4,9,1,6
277,154,25,115,221,274,276,452,213,304,480,87,3,244,406,109,210,428,493,10,3,8,2,7,4,9,5,10,1,6
323,191,176,309,266,489,358,132,153,213,241,429,394,11,135,405,470,147,463,179,5,10,1,6,3,8,2,7,4,9
354,214,311,487,296,203,423,283,46,90,488,255,253,247,349,185,198,336,401,300,3,8,1,6,4,9,5,10,2,7
447,298,7,211,372,481,66,11,64,93,296,143,175,45,141,43,4,103,449,31,3,8,4,9,2,7,5,10,1,6
493,336,157,420,449,242,178,224,35,33,41,470,65,312,417,370,296,370,434,200,5,10,3,8,1,6,2,7,4,9
7,343,323,129,40,19,291,421,477,458,287,311,488,110,193,197,71,90,404,369,2,7,5,10,4,9,3,8,1,6
69,396,474,307,54,218,372,102,432,382,64,168,346,346,407,477,331,326,389,21,1,6,3,8,5,10,4,9,2,7
146,465,155,31,115,465,469,284,357,291,342,25,252,113,167,304,90,30,343,158,3,8,1,6,5,10,4,9,2,7
192,2,305,240,191,226,96,11,375,278,103,368,158,396,444,146,397,313,375,390,3,8,1,6,4,9,5,10,2,7
238,24,471,450,268,488,209,209,315,202,365,209,64,178,220,474,156,33,360,58,3,8,2,7,4,9,1,6,5,10
285,78,105,111,282,202,274,359,224,80,126,50,424,415,434,238,385,222,298,179,5,10,3,8,1,6,2,7,4,9
346,116,286,336,358,464,387,71,195,35,388,393,330,197,210,80,176,474,299,364,1,6,3,8,5,10,2,7,4,9
424,185,484,92,466,256,46,316,229,38,196,281,283,26,17,454,499,272,347,110,3,8,4,9,2,7,5,10,1,6
470,223,133,270,11,471,111,482,138,432,442,122,157,278,231,218,242,476,301,248,3,8,2,7,4,9,5,10,1,6
15,261,284,464,56,201,208,163,94,357,203,465,32,29,477,29,1,196,271,401,4,9,1,6,5,10,2,7,3,8
46,267,418,141,70,400,258,313,2,250,434,275,392,265,190,310,230,401,209,6,4,9,3,8,5,10,1,6,2,7
107,336,130,381,193,224,417,72,5,237,242,178,345,79,12,183,68,183,257,253,2,7,4,9,1,6,3,8,5,10
154,358,234,43,176,392,452,176,415,114,473,474,188,284,195,417,250,341,179,359,5,10,3,8,1,6,4,9,2,7
200,412,400,252,252,153,79,405,370,54,250,331,94,66,472,259,56,92,165,27,1,6,4,9,5,10,3,8,2,7
277,465,81,462,298,383,129,38,295,448,26,188,485,334,201,54,269,281,134,196,3,8,4,9,1,6,2,7,5,10
339,18,262,186,405,176,304,314,313,451,304,60,406,131,8,429,122,94,182,428,4,9,1,6,2,7,3,8,5,10
370,56,429,411,498,453,432,26,269,375,65,403,328,430,300,271,398,331,152,80,1,6,5,10,4,9,2,7,3,8
431,94,78,88,11,152,466,161,162,252,327,244,187,166,499,19,110,3,74,186,5,10,3,8,1,6,4,9,2,7
462,116,228,282,71,414,94,390,180,239,72,70,93,449,274,362,417,286,122,433,2,7,4,9,1,6,5,10,3,8
38,185,410,6,164,190,237,102,136,179,366,459,500,231,66,220,208,22,107,101,5,10,4,9,2,7,3,8,1,6
100,238,75,215,225,421,319,284,92,104,142,300,390,499,311,15,468,258,77,238,3,8,5,10,1,6,2,7,4,9
131,245,179,362,192,73,337,387,454,451,358,95,233,203,478,249,149,400,485,329,1,6,3,8,5,10,4,9,2,7
177,298,392,118,331,397,27,178,3,469,150,484,186,32,317,154,18,229,47,91,3,8,2,7,1,6,5,10,4,9
254,352,57,311,392,143,108,344,460,409,428,341,76,299,61,450,262,450,48,275,3,8,4,9,1,6,5,10,2,7
285,390,191,489,406,358,174,9,369,302,173,167,436,35,291,229,6,138,2,413,5,10,2,7,3,8,1,6,4,9
362,443,373,229,482,119,318,253,387,289,451,23,357,334,67,71,329,437,34,143,4,9,1,6,3,8,2,7,5,10
408,481,54,439,105,428,462,466,358,245,227,397,279,131,375,446,119,188,35,328,5,10,3,8,1,6,4,9,2,7
470,34,204,131,134,142,42,147,283,138,489,238,154,383,104,241,364,393,475,450,5,10,1,6,2,7,3,8,4,9
15,71,355,325,164,357,92,282,176,15,250,64,28,134,318,5,76,65,413,71,3,8,5,10,2,7,4,9,1,6
61,94,4,18,225,102,204,495,178,488,12,406,435,417,78,317,368,333,429,286,1,6,5,10,3,8,2,7,4,9
123,163,202,259,348,411,364,254,181,475,289,279,357,215,402,206,205,115,462,487,2,7,4,9,1,6,5,10,3,8
185,201,336,437,362,125,429,405,90,352,50,120,231,452,115,487,434,320,400,107,1,6,5,10,3,8,2,7,4,9
231,238,17,145,407,356,41,117,77,323,312,463,121,218,360,298,225,70,432,323,1,6,4,9,2,7,5,10,3,8
262,276,167,355,484,101,138,298,1,216,73,320,27,0,120,108,485,291,370,461,1,6,4,9,3,8,2,7,5,10
292,283,302,32,44,347,203,449,442,141,304,130,403,252,366,420,213,480,356,129,4,9,3,8,1,6,2,7,5,10
Sklearn is built for generic algorithms; TSP/VRP are too specific for it. Are you open to trying more specialized libraries than Sklearn?
Recent advances in Reinforcement Learning seem to address TSP and VRP problems in a way that challenges the traditional Combinatorial Optimization approach.
To start with, you can look at this tutorial.
A recent paper shows a method for VRP. They also shared their code on Github.
A more recent paper claims to have a shorter training period.
Generally speaking, the architecture proposed in these papers looks at the VRP job as a whole and improves on a greedy approach in two ways:
The training phase goes back and forth to include future rewards.
The solution architecture includes (at least) two NNs, an Encoder and a Decoder. The Encoder goes through the entire input BEFORE the Decoder starts producing the output.
To summarize, if you want a quick and robust solution, you can use an existing open-source library such as Jsprit. If you have time for research and the resources to train a NN, and can accept the risk of failing, go after Reinforcement Learning.
Based on your comments, using ML purely to generate a starting point for a traditional MIP/constraint/heuristic solver is a better idea than using ML to solve the whole thing, but I believe it to still be a bad idea. In my opinion, you will find it very hard to get a useful initial solution using ML. In a few lines of code you could probably put together a heuristic to greedily grow routes for a search starting point; getting ML to do something of even roughly equivalent quality would be a lot more work, and maybe not even possible.
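As a rough illustration of the kind of greedy heuristic meant here, the sketch below grows a route by always driving to the nearest unserved pickup and then straight to its dropoff. It assumes the question's column layout (5 pickups, then 5 dropoffs) and its rule that each pickup is immediately followed by its dropoff. The function name `greedy_route` and the `start` position are assumptions, since the question gives no vehicle start location.

```python
import math

def greedy_route(features, start=(0.0, 0.0), n_customers=5):
    """Nearest-pickup-first route; returns stop numbers 1..2*n_customers."""
    pickups = [(features[2 * i], features[2 * i + 1]) for i in range(n_customers)]
    drops = [(features[2 * (n_customers + i)], features[2 * (n_customers + i) + 1])
             for i in range(n_customers)]
    pos = start                      # assumed start location (not in the question)
    unvisited = set(range(n_customers))
    route = []
    while unvisited:
        # Pick the unserved customer whose pickup is closest to where we are now.
        nxt = min(sorted(unvisited), key=lambda c: math.dist(pos, pickups[c]))
        route += [nxt + 1, nxt + 1 + n_customers]  # pickup stop, then dropoff stop
        pos = drops[nxt]             # the vehicle ends up at the dropoff
        unvisited.remove(nxt)
    return route

features = [123, 106, 332, 418, 106, 477, 178, 363, 381, 349,
            54, 214, 297, 34, 5, 122, 3, 441, 455, 322]
route = greedy_route(features)
print(route)
```

A route like this is only a starting point for a proper solver; it makes no claim to be optimal.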
If you really wanted to try this (and I emphasize again that it's a bad idea), the choice of features is likely much more important than the choice of classifier. For example, at the moment you're asking the classifier to learn both (a) Pythagoras and then (b) what's a good route. It has to learn Pythagoras because you're passing in the coords directly. ML works best when the features are engineered to make the learning task easier. Passing in a normalised distance matrix instead of the raw coords might be more successful, because then the classifier doesn't have to learn Pythagoras. However, then you have n^2 scaling in features, which would likely cause overfitting and the problems associated with that...
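A minimal sketch of that feature engineering, assuming the 20 raw values are 10 (x, y) pairs in the question's column order. The helper name `distance_features` is hypothetical, and scaling by the largest pairwise distance is just one possible normalisation. Only the upper triangle is kept because the matrix is symmetric, which gives n(n-1)/2 features (45 here), so the growth is still quadratic.

```python
import math

def distance_features(features, n_stops=10):
    """Pairwise stop distances, scaled into [0, 1], upper triangle only."""
    pts = [(features[2 * i], features[2 * i + 1]) for i in range(n_stops)]
    dist = [[math.dist(a, b) for b in pts] for a in pts]
    # Scale by the largest distance so instances of different size are comparable.
    scale = max(max(row) for row in dist) or 1.0
    return [dist[i][j] / scale
            for i in range(n_stops) for j in range(i + 1, n_stops)]

features = [123, 106, 332, 418, 106, 477, 178, 363, 381, 349,
            54, 214, 297, 34, 5, 122, 3, 441, 455, 322]
engineered = distance_features(features)
print(len(engineered))
```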
Alternatively, you could grow the route from empty, using ML to decide the next stop to add each time. So the ML classifier chooses the first stop, then you classify again to choose the second stop, then classify again to choose the third, and so on. This would be simpler too, though the ML will primarily just learn 'what the closest stop is to the last one'. I have known some companies to use this kind of 'ML chooses the next stop or job' approach when they're scheduling/dispatching jobs one at a time for food takeaway/on-demand deliveries - i.e. problems similar to Uber Eats delivering hot food from restaurants. This is a bit different from your case, as it's a dynamic/realtime route optimisation problem, but some companies are actually using ML in vehicle route optimisation for real. In my opinion it's still a bad approach though - e.g. we did a study in this video https://www.youtube.com/watch?v=EMhnXAH5dvM where we look at the effect of this kind of one-at-a-time scheduling/dispatching (which you can use ML for) vs proper route optimisation, and one-at-a-time scheduling/dispatching comes off significantly worse.
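To show what training data for a 'choose the next stop' classifier could look like, here is a sketch that splits one row of the question's data into per-step examples. Several things are assumptions: the helper name `next_stop_examples`, the feature choice (distance from the current position to each remaining pickup, with -1.0 marking customers already served), and skipping the first step because no vehicle start location is given. It relies on the question's rule that each pickup is immediately followed by its dropoff.

```python
import math

def next_stop_examples(features, sequence, n_customers=5):
    """Return (feature_vector, chosen_customer_index) pairs, one per step."""
    pickups = [(features[2 * i], features[2 * i + 1]) for i in range(n_customers)]
    drops = [(features[2 * (n_customers + i)], features[2 * (n_customers + i) + 1])
             for i in range(n_customers)]
    # Pickups sit at even positions of the sequence; convert to 0-based customers.
    order = [stop - 1 for stop in sequence[::2]]
    remaining = set(range(n_customers))
    pos = None
    examples = []
    for customer in order:
        if pos is not None:
            x = [math.dist(pos, pickups[k]) if k in remaining else -1.0
                 for k in range(n_customers)]
            examples.append((x, customer))
        remaining.remove(customer)
        pos = drops[customer]  # vehicle is at this customer's dropoff afterwards
    return examples

features = [123, 106, 332, 418, 106, 477, 178, 363, 381, 349,
            54, 214, 297, 34, 5, 122, 3, 441, 455, 322]
sequence = [5, 10, 2, 7, 1, 6, 4, 9, 3, 8]
examples = next_stop_examples(features, sequence)
for x, label in examples:
    print([round(v, 1) for v in x], "->", label)
```

Each pair can then be fed to any single-output classifier, which avoids predicting all 10 outputs at once.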