// Licensed to the Apache Software Foundation (ASF) under one or more
// contributor license agreements.  See the NOTICE file distributed with
// this work for additional information regarding copyright ownership.
// The ASF licenses this file to You under the Apache License, Version 2.0
// (the "License"); you may not use this file except in compliance with
// the License.  You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
= k-NN Regression

The Apache Ignite Machine Learning component provides two versions of the widely used k-NN (k-nearest neighbors) algorithm: one for classification tasks and one for regression tasks.

This page covers k-NN as a solution for regression tasks.

== Trainer and Model

The k-NN regression algorithm is a non-parametric method whose input consists of the `k` closest training examples in the feature space. Each training example has an associated numerical property value.

The k-NN regression algorithm uses the entire training set to predict a property value for a given test sample.
The predicted value is the average of the property values of the sample's `k` nearest neighbors. If `k` is `1`, the test sample is simply assigned the property value of its single nearest neighbor.
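
The averaging rule described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the Ignite implementation: the class name `KnnRegressionSketch`, its methods, and the sample data are hypothetical, the search is brute force, and the distance is fixed to Euclidean.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrative only: plain Java, not the Ignite implementation.
final class KnnRegressionSketch {
    /** Euclidean distance between two feature vectors of equal length. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Predicts a value for the query as the mean label of its k nearest training examples. */
    static double predict(double[][] features, double[] labels, double[] query, int k) {
        if (k < 1 || features.length == 0)
            throw new IllegalArgumentException("k must be >= 1 and the training set non-empty");

        Integer[] order = new Integer[features.length];
        for (int i = 0; i < order.length; i++)
            order[i] = i;

        // Sort training indices by distance to the query (brute force over the whole training set).
        Arrays.sort(order, Comparator.comparingDouble(i -> euclidean(features[i], query)));

        // Average the labels of the k closest examples (or of all examples if there are fewer than k).
        int n = Math.min(k, order.length);
        double sum = 0.0;
        for (int i = 0; i < n; i++)
            sum += labels[order[i]];

        return sum / n;
    }

    public static void main(String[] args) {
        double[][] features = {{1.0}, {2.0}, {3.0}, {10.0}};
        double[] labels = {10.0, 20.0, 60.0, 100.0};

        // k = 1: the label of the single nearest neighbor (feature 2.0).
        System.out.println(predict(features, labels, new double[] {2.2}, 1));
        // k = 3: the mean of the labels for features 1.0, 2.0, and 3.0.
        System.out.println(predict(features, labels, new double[] {2.2}, 3));
    }
}
```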

Ignite currently supports the following parameters for the k-NN regression algorithm:

* `k` - the number of nearest neighbors.
* `distanceMeasure` - one of the distance metrics provided by the ML framework, such as Euclidean, Hamming, or Manhattan.
* `isWeighted` - `false` by default; if `true`, enables the weighted k-NN algorithm.
* `dataCache` - holds the training set of objects for which the property value is already known.
* `indexType` - the type of distributed spatial index; one of `ARRAY`, `KD_TREE`, or `BALL_TREE`.
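
To make the `distanceMeasure` and `isWeighted` parameters more concrete, here is a minimal plain-Java sketch. It is not the Ignite implementation: the class `WeightedKnnSketch` and its methods are hypothetical, and the weighting scheme (each neighbor weighted by the inverse of its distance) is an assumption that may differ from what Ignite does when `isWeighted` is `true`.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.function.ToDoubleBiFunction;

// Illustrative only: plain Java, not the Ignite implementation.
final class WeightedKnnSketch {
    /** Manhattan (L1) distance: sum of absolute coordinate differences. */
    static double manhattan(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++)
            sum += Math.abs(a[i] - b[i]);
        return sum;
    }

    /** Hamming distance: number of coordinates that differ. */
    static double hamming(double[] a, double[] b) {
        int diff = 0;
        for (int i = 0; i < a.length; i++)
            if (a[i] != b[i])
                diff++;
        return diff;
    }

    /** Predicts the (optionally inverse-distance weighted) mean label of the k nearest examples. */
    static double predict(double[][] features, double[] labels, double[] query, int k,
        ToDoubleBiFunction<double[], double[]> distance, boolean weighted) {
        if (k < 1 || features.length == 0)
            throw new IllegalArgumentException("k must be >= 1 and the training set non-empty");

        double[] dist = new double[features.length];
        Integer[] order = new Integer[features.length];
        for (int i = 0; i < features.length; i++) {
            dist[i] = distance.applyAsDouble(features[i], query);
            order[i] = i;
        }

        // Order training indices from nearest to farthest.
        Arrays.sort(order, Comparator.comparingDouble(i -> dist[i]));

        double weightedSum = 0.0;
        double weightTotal = 0.0;
        for (int j = 0; j < Math.min(k, order.length); j++) {
            int idx = order[j];

            // Assumption: inverse-distance weights; an exact match returns its label directly.
            if (weighted && dist[idx] == 0.0)
                return labels[idx];

            double w = weighted ? 1.0 / dist[idx] : 1.0;
            weightedSum += w * labels[idx];
            weightTotal += w;
        }

        return weightedSum / weightTotal;
    }

    public static void main(String[] args) {
        double[][] features = {{0.0, 0.0}, {2.0, 0.0}, {0.0, 4.0}};
        double[] labels = {1.0, 3.0, 5.0};
        double[] query = {0.5, 0.0};

        // Unweighted: plain mean of the two nearest labels.
        System.out.println(predict(features, labels, query, 2, WeightedKnnSketch::manhattan, false));
        // Weighted: the closer neighbor (label 1.0) pulls the prediction toward itself.
        System.out.println(predict(features, labels, query, 2, WeightedKnnSketch::manhattan, true));
    }
}
```

With the sample data above, the two nearest neighbors under Manhattan distance lie at distances 0.5 and 1.5, so the weighted prediction (about 1.5) sits closer to the nearer neighbor's label than the unweighted mean (2.0).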

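[source, java]
----
import org.apache.ignite.ml.knn.regression.KNNRegressionModel;
import org.apache.ignite.ml.knn.regression.KNNRegressionTrainer;
import org.apache.ignite.ml.knn.utils.indices.SpatialIndexType;
import org.apache.ignite.ml.math.distances.ManhattanDistance;

// Create the trainer.
KNNRegressionTrainer trainer = new KNNRegressionTrainer()
  .withK(5)
  .withIdxType(SpatialIndexType.BALL_TREE)
  .withDistanceMeasure(new ManhattanDistance())
  .withWeighted(true);

// Train the model. `ignite`, `dataCache`, and `vectorizer` are assumed to be defined elsewhere.
KNNRegressionModel knnMdl = trainer.fit(
  ignite,
  dataCache,
  vectorizer
);

// Make a prediction for the feature vector `observation`.
double prediction = knnMdl.predict(observation);
----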

== Example

To see how k-NN regression can be used in practice, try this https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/knn/KNNRegressionExample.java[example^], which is available on GitHub and delivered with every Apache Ignite distribution.

The training dataset is the Iris dataset, which can be loaded from the https://archive.ics.uci.edu/ml/datasets/iris[UCI Machine Learning Repository^].
