**I. ADASYN for Imbalanced Learning:**

This website includes the algorithms, demos, and source code implementation of the Adaptive Synthetic Sampling approach (ADASYN) for imbalanced learning, as presented in our original paper [1]. The essential idea of ADASYN is to use a weighted distribution for different minority class examples according to their level of difficulty in learning, where more synthetic data is generated for minority class examples that are harder to learn compared to those minority examples that are easier to learn. As a result, the ADASYN approach improves learning with respect to the data distributions in two ways: (1) reducing the bias introduced by the class imbalance, and (2) adaptively shifting the classification decision boundary toward the difficult examples.

*ADASYN Algorithm:*

**Input**

(1) Training data set $D_{tr}$ with $m$ samples. Define $m_s$ and $m_l$ as the number of minority class examples and the number of majority class examples, respectively. Therefore, $m_s \le m_l$ and $m_s + m_l = m$.

**Procedure**

(1) Calculate the number $G$ of synthetic data examples that need to be generated for the minority class:

$$G = (m_l - m_s) \times \beta$$

where $m_s$ and $m_l$ are the minority and majority class counts, and $\beta \in [0, 1]$ is a parameter used to specify the desired balance level after generation of the synthetic data. $\beta = 1$ means a fully balanced data set is created after the generation process;

(2) For each example $x_i$ in the minority class, find the K nearest neighbors based on the Euclidean distance in $n$ dimensional space, and calculate the ratio $r_i$ defined as:

$$r_i = \Delta_i / K, \quad i = 1, \ldots, m_s$$

where $\Delta_i$ is the number of examples in the K nearest neighbors of $x_i$ that belong to the majority class, therefore $r_i \in [0, 1]$;

(3) Normalize the ratios $r_i$ according to $\hat{r}_i = r_i / \sum_{i=1}^{m_s} r_i$, so that $\hat{r}_i$ is a density distribution ($\sum_{i} \hat{r}_i = 1$);

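(4) Calculate the number $g_i$ of synthetic data examples that need to be generated for each minority example $x_i$:

$$g_i = \hat{r}_i \times G$$

where $\hat{r}_i$ is the normalized ratio from step (3) and $G$ is the total number of synthetic data examples from step (1);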

(5) For each minority example $x_i$, generate the $g_i$ synthetic data examples computed in step (4) according to the following two steps:

(i) Randomly choose one minority data example $x_{zi}$ from the K nearest neighbors for data $x_i$.

(ii) Generate the synthetic data example:

$$s_i = x_i + (x_{zi} - x_i) \times \lambda$$

where $(x_{zi} - x_i)$ is the difference vector in $n$ dimensional space, and $\lambda$ is a random number: $\lambda \in [0, 1]$.
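
The five steps above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the MATLAB package distributed on this site: the function name `adasyn_sketch`, the `minority_label` and `seed` arguments, rounding each per-example count to the nearest integer, and drawing the step (5) neighbors from the minority class only are all choices made for this example.

```python
import numpy as np


def adasyn_sketch(X, y, beta=1.0, k=5, minority_label=1, seed=0):
    """Return synthetic minority examples following steps (1)-(5)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    min_idx = np.flatnonzero(y == minority_label)
    X_min = X[min_idx]
    m_s = len(X_min)
    m_l = len(X) - m_s
    if m_s < 2:
        raise ValueError("need at least two minority examples to interpolate")

    # Step (1): total number of synthetic examples to generate.
    G = (m_l - m_s) * beta

    # Step (2): K nearest neighbors of each minority example in the whole
    # training set (Euclidean distance), excluding the example itself.
    dists = np.linalg.norm(X_min[:, None, :] - X[None, :, :], axis=2)
    dists[np.arange(m_s), min_idx] = np.inf
    nn_all = np.argsort(dists, axis=1)[:, :k]
    delta = (y[nn_all] != minority_label).sum(axis=1)  # majority neighbors
    r = delta / k

    # Step (3): normalize the ratios into a density distribution.
    if r.sum() == 0:
        # No minority example has a majority neighbor: nothing is "hard".
        return np.empty((0, X.shape[1]))
    r_hat = r / r.sum()

    # Step (4): per-example counts (assumption: round to nearest integer).
    g = np.rint(r_hat * G).astype(int)

    # Step (5): interpolate toward a random minority neighbor.
    # Assumption: neighbors are searched within the minority class only,
    # so every chosen neighbor is guaranteed to be a minority example.
    d_min = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d_min, np.inf)
    nn_min = np.argsort(d_min, axis=1)[:, :min(k, m_s - 1)]

    synthetic = []
    for i in range(m_s):
        for _ in range(g[i]):
            z = rng.choice(nn_min[i])   # step (i)
            lam = rng.random()          # lambda in [0, 1)
            synthetic.append(X_min[i] + (X_min[z] - X_min[i]) * lam)  # step (ii)
    return np.array(synthetic).reshape(-1, X.shape[1])


if __name__ == "__main__":
    # Toy 1-D data: 20 majority points next to 4 minority points.
    X_demo = np.concatenate([np.arange(20.0), [19.5, 20.5, 21.5, 22.5]])[:, None]
    y_demo = np.array([0] * 20 + [1] * 4)
    print(adasyn_sketch(X_demo, y_demo, beta=1.0, k=3).shape)
```

Because each per-example count is rounded, the number of rows returned can differ slightly from the total in step (1). The brute-force distance matrix keeps the sketch short; for large data sets a spatial index would be the usual substitute.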

**III. Demos & Source Code:**

*Demo 1:*
Given the training data with class labels, generate synthetic minority class data.

*Function: [AdaSYNData, AdaSYNLabel] = ADASYN(TrainingData, TrainingLabel, beta, kNN)*

This function returns the synthetic minority class data generated by our ADASYN approach according to a specified balance level *beta*. *TrainingData* is an $m \times n$ matrix, where $m$ is the total number of training data and $n$ is the feature dimension. *TrainingLabel* is a class label vector for *TrainingData*. *kNN* is an integer representing the number of nearest neighbors under consideration.
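
To see how *beta* sets the balance level, the following Python sketch computes the class counts that a given label vector would end up with. The helper name `balance_after_generation` and its `minority_label` argument are hypothetical and are not part of the ADASYN function above; rounding the count to an integer is also an assumption.

```python
import numpy as np


def balance_after_generation(training_label, beta, minority_label=1):
    """Return (synthetic count, minority count after generation, majority count)."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta is expected to lie in [0, 1]")
    labels = np.asarray(training_label).ravel()
    m_s = int(np.sum(labels == minority_label))  # minority examples
    m_l = int(labels.size - m_s)                 # majority examples
    # Requested number of synthetic minority examples for this beta.
    G = int(round((m_l - m_s) * beta))
    return G, m_s + G, m_l


if __name__ == "__main__":
    labels = [0] * 90 + [1] * 10
    for beta in (0.0, 0.5, 1.0):
        print(beta, balance_after_generation(labels, beta))
```

With 90 majority and 10 minority labels, a *beta* of 0.5 requests 40 synthetic examples, and a *beta* of 1 requests 80, which equalizes the two classes.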

*Source Code: You can download the source
code from here*

**Reference**

The software package and examples provided here accompany the paper below. If you are considering using this algorithm in your research or work, please cite and refer to it:

[1] H. He, Y. Bai, E. A. Garcia, and S. Li, "ADASYN: Adaptive synthetic sampling approach for imbalanced learning," in Proc. IEEE Int. Joint Conf. Neural Networks (IJCNN'08), pp. 1322-1328, 2008.