「OSPP - KubeEdge SIG AI」Implementation of a Class Incremental Learning Algorithm Evaluation System based on Ianvs #82

Closed · wants to merge 31 commits

# Implementation of a Class Incremental Learning Algorithm Evaluation System based on Ianvs

## 1 Motivation

### 1.1 Background
Currently, lifelong learning faces a challenge: new classes may appear when models are trained on a new data domain (for example, in the figure below, the three classes in red are new classes in `Domain 2`), which makes it difficult for models to maintain generalization ability and leads to a severe performance drop.

<div align = center>
<img src="images/OSPP_MDIL-SS_7.png" width = "300" height = "250" alt="MDIL-SS" />
</div>

Many algorithms have been proposed to solve the class-increment problem in domain-shift scenarios. However, such algorithms lack a unified testing environment, which makes it hard to compare them. In some cases, new algorithms are only tested on certain datasets, which is not rigorous.

In this context, it is necessary to develop an evaluation system that provides standardized testing for class-incremental learning algorithms, which are increasingly widely used in industry, and evaluates their effectiveness.

[KubeEdge-Ianvs](https://github.com/kubeedge/ianvs) is a distributed collaborative AI benchmarking project which can perform benchmarks with respect to several types of paradigms (e.g. single-task learning, incremental learning, etc.). This project aims to leverage the benchmarking capabilities of Ianvs to develop an evaluation system for class-incremental learning algorithms, in order to fulfill the benchmarking requirements specific to this type of algorithm.

### 1.2 Goals

This project aims to build a benchmark for class-incremental learning in domain-shift scenarios on KubeEdge-Ianvs. It includes:
- Reproduce the Multi-Domain Incremental Learning for Semantic Segmentation (MDIL-SS) algorithm proposed in the [WACV2022 paper](https://github.com/prachigarg23/MDIL-SS).
- Use three datasets (Cityscapes, SYNTHIA, and the Cloud-Robotics dataset provided by KubeEdge SIG AI) to conduct benchmarking tests and generate a comprehensive test report (including rankings, time, algorithm name, dataset, and test metrics, among other details).

## 2 Proposal

`Implementation of a Class Incremental Learning Algorithm Evaluation System based on Ianvs`, taking the MDIL-SS algorithm as an example, aims to test the performance of class-incremental learning models against benchmarking standards, making development more efficient and productive.

The scope of the system includes:

- A test case for class-incremental semantic segmentation algorithms, in which a test report can be generated by following the instructions.
- Easy to expand, allowing users to seamlessly integrate existing algorithms into the system for testing.

Target users include:

- Beginners: familiarize themselves with concepts such as distributed synergy AI and lifelong learning.
- Developers: quickly integrate class-incremental algorithms into Ianvs and test their performance for further optimization.


## 3 Design Details

### 3.1 Overall Design

The following is the architecture diagram of the system; this project focuses on the `unknown task processing` module.

Before entering this module, unknown tasks have been [detected](https://github.com/kubeedge/ianvs/tree/4ae10f0e5e1ab958e143b04fade4acc448009857/examples/scene-based-unknown-task-recognition/lifelong_learning_bench) and samples have been labeled by some means such as manual labeling. The core concern of this module is how to use unknown task samples (i.e., incremental class samples) to update the model.
![MDIL-SS](images/OSPP_MDIL-SS_6.png)

**Reviewer:** What is the meaning of the dotted-line box in the figure? It is best to explain it in the figure. Secondly, why do you need to upload the inference results of unseen tasks to the cloud instead of directly uploading the unseen samples to the cloud?

**Author (@qxygxt, Aug 7, 2023):** For the first suggestion, it has been clarified in the figure.

**Reviewer:** OK. As for the second problem, I can see that module 5-1 identifies unseen tasks and transmits data to the cloud for processing.

**Author (@qxygxt):** To clarify the second problem: in the framework of this project, unseen samples are detected at the edge, uploaded to the cloud for labeling, and eventually used for unseen task processing (the part this project focuses on). The reason for this design is that, in many cases, the computing and storage capacity of the edge is not enough to support model training, so unseen samples need to be uploaded to the cloud for labeling and training. The cloud is significantly better than the edge in terms of computing performance, and it can label and process unknown samples more quickly and efficiently.

**Reviewer:** Thank you for your explanation. I understand.


The following diagram shows how the algorithm works in Ianvs.

![MDIL-SS](images/OSPP_MDIL-SS_8.png)
**Reviewer:** Thank you so much for the revision; it looks much better. Nevertheless, similar to the above design, the project scope and module design are still not very clear to most community members in the figure. It is thus suggested to include the module designs, i.e., how the other modules would be implemented.


### 3.2 Dataset

This project will use three datasets, namely **Cityscapes**, **SYNTHIA**, and KubeEdge SIG AI's **Cloud-Robotics** dataset (**CS**, **SYN**, **CR**).

Ianvs already provides the [Cityscapes and SYNTHIA datasets](https://github.com/kubeedge/ianvs/blob/main/docs/proposals/algorithms/lifelong-learning/Additional-documentation/curb_detetion_datasets.md). The following two images are examples from each of them.

| CS Example | SYN Example |
| :----------------------------------------------------------: | :----------------------------------------------------------: |
| ![MDIL-SS](images/OSPP_MDIL-SS_1.png) |![MDIL-SS](images/OSPP_MDIL-SS_2.png) |

In addition, this project utilizes the CR dataset from KubeEdge.

| CR Example |
| :----------------------------------------------------------: |
| ![MDIL-SS](images/OSPP_MDIL-SS_3.png) |

The following is an excerpt from the `train-index-mix.txt` file. The first column is the path to the original image, and the second column is the path to the corresponding label image.

```txt
rgb/train/20220420_garden/00480.png gtFine/train/20220420_garden/00480_TrainIds.png
rgb/train/20220420_garden/00481.png gtFine/train/20220420_garden/00481_TrainIds.png
rgb/train/20220420_garden/00483.png gtFine/train/20220420_garden/00483_TrainIds.png
```

The following is an excerpt from the `test-index.txt` file, which follows the same format as the training index.

```txt
rgb/test/20220420_garden/01357.png gtFine/test/20220420_garden/01357_TrainIds.png
rgb/test/20220420_garden/01362.png gtFine/test/20220420_garden/01362_TrainIds.png
rgb/test/20220420_garden/01386.png gtFine/test/20220420_garden/01386_TrainIds.png
rgb/test/20220420_garden/01387.png gtFine/test/20220420_garden/01387_TrainIds.png
```
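
As an illustration, the following is a minimal sketch of how such an index file could be parsed into image/label path pairs; the `load_index` helper and the example paths are assumptions for demonstration and are not part of the Ianvs codebase.

```python
from pathlib import Path

def load_index(index_path, data_root):
    """Parse an index file into (image_path, label_path) pairs.

    Each non-empty line contains two whitespace-separated relative paths:
    the RGB image and the corresponding *_TrainIds.png label map.
    """
    root = Path(data_root)
    pairs = []
    for line in Path(index_path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        image_rel, label_rel = line.split()
        pairs.append((root / image_rel, root / label_rel))
    return pairs

# Hypothetical usage:
# pairs = load_index("train-index-mix.txt", "/data/cloud-robotics")
# image_path, label_path = pairs[0]
```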

As shown in the table below, this dataset contains 7 groups and 30 classes.

| Group | Classes |
| :----------: | :----------------------------------------------------------: |
| flat | road · sidewalk · ramp · runway |
| human | person · rider |
| vehicle | car · truck · bus · train · motorcycle · bicycle |
| construction | building · wall · fence · stair · curb · flowerbed · door |
| object | pole · traffic sign · traffic light · CCTV camera · manhole · hydrant · belt · dustbin |
| nature | vegetation · terrain |
| sky | sky |

For more details about the CR dataset, please refer to [this link](https://github.com/kubeedge/ianvs/blob/main/docs/proposals/scenarios/Cloud-Robotics/Cloud-Robotics_zh.md).

### 3.3 File-level Design

The development consists of two main parts: the **test environment (test env)** and the **test algorithms**.

The test environment can be understood as an exam paper that specifies the dataset, evaluation metrics, and the number of increments used for testing; it is used to evaluate the performance of the "students". The test algorithms can be seen as the students who take the exam.

<div align = center>
<img src="images/OSPP_MDIL-SS_4.png" alt="MDIL-SS" />
</div>

In addition, `benchmarkingjob.yaml` integrates the configuration of the test env and the test algorithms, and is a required Ianvs configuration file.

For the test env, development mainly focuses on the implementation of `mIoU.py`; for the test algorithms, it is concentrated on `basemodel.py`, as shown in the picture below.

![MDIL-SS](images/OSPP_MDIL-SS_5.png)

#### 3.3.1 Test Environment

The following code is the `testenv.yaml` file designed for this project.

As the configuration file for the test env, it covers three aspects: the dataset and number of increments, the model validation logic, and the model evaluation metrics.

```yaml
# testenv.yaml

testenv:

  # 1
  dataset:
    train_url: "/home/QXY/ianvs/dataset/mdil-ss-dataset/train_data/index.txt"
    test_url: "/home/QXY/ianvs/dataset/mdil-ss-dataset/test_data/index.txt"
    using: "CS SYN CR"
    incremental_rounds: 3

  # 2
  model_eval:
    model_metric:
      name: "mIoU"
      url: "/home/QXY/ianvs/examples/mdil-ss/testenv/mIoU.py"
    threshold: 0
    operator: ">="

  # 3
  metrics:
    - name: "mIoU"
      url: "/home/QXY/ianvs/examples/mdil-ss/testenv/mIoU.py"
    - name: "BWT"
    - name: "FWT"
```

After each round of lifelong learning, the model will be evaluated on the validation set. In this project, **mIoU** (mean Intersection over Union) is used as the evaluation metric. If the model achieves an mIoU greater than the specified threshold on the validation set, the model will be updated.
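
For reference, a minimal sketch of the per-class computation that `mIoU.py` could implement is shown below; the function signature is an assumption for illustration and may differ from the final Ianvs metric interface.

```python
import numpy as np

def mIoU(y_true, y_pred, num_classes=30):
    """Mean Intersection-over-Union across classes.

    y_true, y_pred: arrays of per-pixel class IDs with identical shapes.
    Classes absent from both prediction and ground truth are skipped
    so they do not distort the average.
    """
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    ious = []
    for cls in range(num_classes):
        true_mask = y_true == cls
        pred_mask = y_pred == cls
        union = np.logical_or(true_mask, pred_mask).sum()
        if union == 0:
            continue  # class not present in this sample
        intersection = np.logical_and(true_mask, pred_mask).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else 0.0
```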

**BWT** (Backward Transfer) and **FWT** (Forward Transfer) are two important concepts in the field of lifelong learning. BWT measures the impact that learning the current task has on the performance of previously learned tasks, while FWT measures the impact that previously learned knowledge has on the performance of future, not-yet-learned tasks. Along with mIoU, they serve as test metrics to assess the lifelong learning capability of the model in semantic segmentation. Functions related to BWT and FWT have already been implemented in the [Ianvs repository](https://github.com/kubeedge/ianvs/blob/main/core/testcasecontroller/metrics/metrics.py).
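
For clarity, the definitions commonly used in the lifelong-learning literature (e.g., by Lopez-Paz and Ranzato) are given below, where $R_{i,j}$ is the test performance on task $j$ after training on the first $i$ tasks, $T$ is the number of tasks, and $\bar{b}_j$ is the performance of a randomly initialized model on task $j$; the implementation in the Ianvs repository may differ in detail.

$$
\mathrm{BWT} = \frac{1}{T-1}\sum_{j=1}^{T-1}\left(R_{T,j} - R_{j,j}\right),
\qquad
\mathrm{FWT} = \frac{1}{T-1}\sum_{j=2}^{T}\left(R_{j-1,j} - \bar{b}_j\right)
$$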

#### 3.3.2 Test Algorithm

The following code is the `mdil-ss_algorithm.yaml` file designed for this project.

```yaml
# mdil-ss_algorithm.yaml

algorithm:
  paradigm_type: "incrementallearning"

  incremental_learning_data_setting:
    train_ratio: 0.8
    splitting_method: "default"

  modules:
    - type: "basemodel"

      # 1
      name: "ERFNet"
      url: "/home/QXY/ianvs/examples/mdil-ss/testalgorithms/mdil-ss/basemodel.py"

      # 2
      hyperparameters:
        - learning_rate:
            values:
              - 0.01
              - 0.0001
        - epochs:
            values:
              - 5
              - 10
        - batch_size:
            values:
              - 10
              - 20
```

First, `basemodel.py` encapsulates the various functional components of the model, including its architecture, layers, and operations; it is the focus of development.

Second, the **hyperparameter** settings for the model are also defined in this YAML file. By configuring multiple values per hyperparameter in `mdil-ss_algorithm.yaml`, the evaluation system can run tests over multiple hyperparameter combinations at once.
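
As an illustration only, `basemodel.py` might follow the skeleton below. The registration via Sedna's `ClassFactory` and the `train`/`predict`/`evaluate`/`save`/`load` method names follow the convention used in existing Ianvs examples; the constructor details are placeholders, and the actual interface should be checked against the repository.

```python
# basemodel.py -- illustrative skeleton, not the final implementation
from sedna.common.class_factory import ClassFactory, ClassType

@ClassFactory.register(ClassType.GENERAL, alias="ERFNet")
class BaseModel:
    """Wraps the ERFNet segmentation model behind the Ianvs estimator interface."""

    def __init__(self, **kwargs):
        # Hyperparameters are injected from mdil-ss_algorithm.yaml.
        self.learning_rate = kwargs.get("learning_rate", 0.0001)
        self.epochs = kwargs.get("epochs", 5)
        self.batch_size = kwargs.get("batch_size", 10)
        self.model = None  # the ERFNet network would be constructed here

    def train(self, train_data, valid_data=None, **kwargs):
        """Train or incrementally update the model on the given samples."""
        ...

    def predict(self, data, **kwargs):
        """Return per-pixel class predictions for the input images."""
        ...

    def evaluate(self, data, **kwargs):
        """Evaluate predictions, e.g. with the mIoU metric."""
        ...

    def save(self, model_path):
        """Persist model weights to model_path."""
        ...

    def load(self, model_url):
        """Restore model weights from model_url."""
        ...
```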

#### 3.3.3 Test Report

The test report is designed as follows. It contains the ranking, algorithm name, the three metrics, paradigm, number of rounds, dataset name, base model, the three hyperparameters, and the time.

| Rank | Algorithm | mIoU | BWT | FWT | Paradigm | Round | Dataset | Basemodel | Learning_rate | Epoch | Batch_size | Time |
| ---- | :-------: | ------ | ----- | ----- | ---------------- | ----- | --------- | --------- | ------------- | ----- | ---------- | ------------------- |
| 1 | MDIL-SS | 0.6521 | 0.075 | 0.021 | Lifelonglearning | 3 | CS SYN CR | ERFNet | 0.0001 | 1 | 10 | 2023-05-28 17:05:15 |

## 4 Roadmap
**Reviewer:** Before getting into advanced algorithm development, it is suggested to ensure all modules work with a simple version first.


### 4.1 Phase 1 (July 1st - August 15th)

- Engage in discussions with the project mentor and the community to finalize the development details.

- Further refine the workflow of the MDIL-SS testing task, including the relationships between different components and modules.

- Develop the test environment, including datasets and model metrics.

- Begin the development of the base model encapsulation for the test algorithms.

### 4.2 Phase 2 (August 16th - September 30th)

- Summarize the progress of Phase 1 and generate relevant documentation.

- Complete the remaining development tasks, including models, test reports, etc.

- Generate initial algorithm evaluation reports.

- Engage in discussions with the project mentor and the community to further supplement and improve the project.

- Organize the project code and related documentation, and merge them into the Ianvs repository.

- Upon merging into the repository, explore new research areas and produce additional outcomes based on this project.