// Licensed to the Apache Software Foundation (ASF) under one or more
// contributor license agreements.  See the NOTICE file distributed with
// this work for additional information regarding copyright ownership.
// The ASF licenses this file to You under the Apache License, Version 2.0
// (the "License"); you may not use this file except in compliance with
// the License.  You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
= Evaluator

Apache Ignite ML comes with a number of machine learning algorithms that can learn from data and make predictions on it. A model built with these algorithms needs to be evaluated against criteria that depend on the application and its requirements. For this purpose, Apache Ignite ML provides a suite of classification and regression metrics.

== Classification model evaluation

While there are many different types of classification algorithms, the evaluation of classification models follows similar principles. In a supervised classification problem, each data point has a true output (the label) and a predicted output generated by the model. For this reason, the result for each data point can be assigned to one of four categories:

* True Positive (TP) - label is positive and prediction is also positive
* True Negative (TN) - label is negative and prediction is also negative
* False Positive (FP) - label is negative but prediction is positive
* False Negative (FN) - label is positive but prediction is negative

These four categories are especially important for binary classification.

CAUTION: Multiclass classification evaluation is not yet supported in Apache Ignite ML.

Apache Ignite ML supports the following binary classification metrics:

* Accuracy
* Balanced accuracy
* F-Measure
* FallOut
* FN
* FP
* FDR
* MissRate
* NPV
* Precision
* Recall
* Specificity
* TN
* TP

Explanations and formulas for these metrics can be found in the https://en.wikipedia.org/wiki/Evaluation_of_binary_classifiers[Wikipedia article on the evaluation of binary classifiers].
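
To make the relationship between the four outcome categories and the derived metrics concrete, the following minimal sketch tallies TP, TN, FP and FN from arrays of labels and predictions and computes several of the listed metrics using their textbook formulas. It is plain Java and does not use the Apache Ignite ML API; the class `BinaryMetricsSketch` and its method names are hypothetical, chosen only for illustration, and are not the library's implementation.

```java
// Illustrative sketch only: plain Java, not the Apache Ignite ML implementation.
// The class and method names are hypothetical.
class BinaryMetricsSketch {
    final long tp, tn, fp, fn;

    // Tallies the four outcome categories, treating positiveLabel as the positive class.
    BinaryMetricsSketch(double[] labels, double[] predictions, double positiveLabel) {
        if (labels.length != predictions.length)
            throw new IllegalArgumentException("labels and predictions must have equal length");

        long tp = 0, tn = 0, fp = 0, fn = 0;
        for (int i = 0; i < labels.length; i++) {
            boolean actual = labels[i] == positiveLabel;
            boolean predicted = predictions[i] == positiveLabel;

            if (actual && predicted) tp++;          // label positive, prediction positive
            else if (!actual && !predicted) tn++;   // label negative, prediction negative
            else if (!actual) fp++;                 // label negative, prediction positive
            else fn++;                              // label positive, prediction negative
        }
        this.tp = tp; this.tn = tn; this.fp = fp; this.fn = fn;
    }

    // Note: a zero denominator yields NaN, since the division is done in double arithmetic.
    double accuracy()         { return (double) (tp + tn) / (tp + tn + fp + fn); }
    double precision()        { return (double) tp / (tp + fp); }
    double recall()           { return (double) tp / (tp + fn); }
    double specificity()      { return (double) tn / (tn + fp); }
    double fallOut()          { return (double) fp / (fp + tn); }
    double missRate()         { return (double) fn / (fn + tp); }
    double balancedAccuracy() { return (recall() + specificity()) / 2; }

    // F-Measure with beta = 1: the harmonic mean of precision and recall.
    double fMeasure() {
        double p = precision(), r = recall();
        return 2 * p * r / (p + r);
    }

    public static void main(String[] args) {
        double[] labels      = {1, 1, 1, 1, 0, 0, 0, 0, 0, 0};
        double[] predictions = {1, 1, 1, 0, 1, 0, 0, 0, 0, 0};

        BinaryMetricsSketch m = new BinaryMetricsSketch(labels, predictions, 1.0);

        System.out.printf("TP=%d TN=%d FP=%d FN=%d%n", m.tp, m.tn, m.fp, m.fn);
        System.out.printf("accuracy=%.3f precision=%.3f recall=%.3f f1=%.3f%n",
            m.accuracy(), m.precision(), m.recall(), m.fMeasure());
    }
}
```

For the sample data in `main`, the tally is TP = 3, TN = 5, FP = 1 and FN = 1, which gives an accuracy of 0.8 and a precision and recall of 0.75 each.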


[source, java]
----
// `ignite` (a started Ignite instance) and `dataCache` (the cache with the data) are assumed to exist.

// Define the vectorizer.
Vectorizer<Integer, Vector, Integer, Double> vectorizer = new DummyVectorizer<Integer>()
   .labeled(Vectorizer.LabelCoordinate.FIRST);

// Define the trainer.
SVMLinearClassificationTrainer trainer = new SVMLinearClassificationTrainer();

// Train the model.
SVMLinearClassificationModel mdl = trainer.fit(ignite, dataCache, vectorizer);

// Calculate all classification metrics.
EvaluationResult res = Evaluator
  .evaluateBinaryClassification(dataCache, mdl, vectorizer);

// Read a single metric from the result.
double accuracy = res.get(MetricName.ACCURACY);
----


== Regression model evaluation

Regression analysis predicts a continuous output variable from a number of independent variables.

Apache Ignite ML supports the following regression metrics:

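* MAE (mean absolute error)
* R2 (coefficient of determination)
* RMSE (root mean squared error)
* RSS (residual sum of squares)
* MSE (mean squared error)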
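
As a rough illustration of what these metrics measure, the sketch below computes each of them from an array of true values and an array of predictions using the standard textbook definitions. It is plain Java and does not call the Apache Ignite ML API; the class `RegressionMetricsSketch` and its method names are hypothetical, and it is an assumption that the library's own calculations follow the same definitions in every detail.

```java
// Illustrative sketch only: plain Java, not the Apache Ignite ML implementation.
// The class and method names are hypothetical.
class RegressionMetricsSketch {
    // Mean absolute error: the average of |y - yHat|.
    static double mae(double[] y, double[] yHat) {
        double sum = 0;
        for (int i = 0; i < y.length; i++)
            sum += Math.abs(y[i] - yHat[i]);
        return sum / y.length;
    }

    // Residual sum of squares: the sum of (y - yHat)^2.
    static double rss(double[] y, double[] yHat) {
        double sum = 0;
        for (int i = 0; i < y.length; i++) {
            double err = y[i] - yHat[i];
            sum += err * err;
        }
        return sum;
    }

    // Mean squared error: RSS divided by the number of points.
    static double mse(double[] y, double[] yHat) {
        return rss(y, yHat) / y.length;
    }

    // Root mean squared error: the square root of MSE.
    static double rmse(double[] y, double[] yHat) {
        return Math.sqrt(mse(y, yHat));
    }

    // Coefficient of determination: 1 - RSS / TSS,
    // where TSS is the sum of squared deviations of y from its mean.
    static double r2(double[] y, double[] yHat) {
        double mean = 0;
        for (double v : y)
            mean += v;
        mean /= y.length;

        double tss = 0;
        for (double v : y)
            tss += (v - mean) * (v - mean);

        return 1 - rss(y, yHat) / tss;
    }

    public static void main(String[] args) {
        double[] y    = {3, 5, 7, 9};   // true values
        double[] yHat = {2, 5, 8, 9};   // predicted values

        System.out.printf("MAE=%.4f RSS=%.4f MSE=%.4f RMSE=%.4f R2=%.4f%n",
            mae(y, yHat), rss(y, yHat), mse(y, yHat), rmse(y, yHat), r2(y, yHat));
    }
}
```

For the sample data in `main`, the errors are 1, 0, -1 and 0, so MAE is 0.5, RSS is 2, MSE is 0.5, and R2 is 1 - 2/20 = 0.9.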


[source, java]
----
// `ignite` (a started Ignite instance) and `dataCache` (the cache with the data) are assumed to exist.

// Define the vectorizer.
Vectorizer<Integer, Vector, Integer, Double> vectorizer = new DummyVectorizer<Integer>()
   .labeled(Vectorizer.LabelCoordinate.FIRST);

// Define the trainer.
KNNRegressionTrainer trainer = new KNNRegressionTrainer()
    .withK(5)
    .withDistanceMeasure(new ManhattanDistance())
    .withIdxType(SpatialIndexType.BALL_TREE)
    .withWeighted(true);

// Train the model.
KNNRegressionModel knnMdl = trainer.fit(ignite, dataCache, vectorizer);

// Calculate all regression metrics.
EvaluationResult res = Evaluator
  .evaluateRegression(dataCache, knnMdl, vectorizer);

// Read a single metric from the result.
double mse = res.get(MetricName.MSE);
----
