Celltype annotation automl #423

xingzhongyu · 2024-02-20T11:49:54Z

Complete the AutoML task for celltype_annotation. Currently, for each cell type annotation algorithm, one dataset is chosen arbitrarily for a full run. Because of the merge timing, these runs have not finished, but they are close to complete. I will wait until this merge is completed before resuming the runs.

dance/pipeline.py

xingzhongyu · 2024-02-27T01:11:45Z

dance/pipeline.py

@@ -835,6 +849,22 @@ def save_summary_data(entity, project, sweep_id, summary_file_path):
        result.update({"id": run.id})
        summary_data.append(flatten_dict(result))  # get result and config
    ans = pd.DataFrame(summary_data).set_index(["id"])
+    conf = OmegaConf.load(conf_load_path)


Change the column names of the summary table

xingzhongyu · 2024-02-27T01:15:20Z

dance/pipeline.py

        for x in row:
            for k in conf.pipeline:
                if k["target"] == x:
                    pipeline.append(k)
+        for i, f in zip(required_indexes, required_funs):


Some required functions, such as setConfig, can be taken from the step 2 YAML instead of the default step 3 YAML.
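A minimal sketch of this idea, assuming each pipeline is a list of dicts with a `target` key as in the loop above. The helper name `insert_required_steps`, the step names other than `SetConfig`, and all parameter values are hypothetical; this is not the code in `dance/pipeline.py`.

```python
def insert_required_steps(step3_pipeline, step2_pipeline, required_indexes, required_funs):
    """Return a copy of step3_pipeline with required steps copied from step2_pipeline.

    Both pipelines are assumed to be lists of dicts with a "target" key.
    """
    by_target = {step["target"]: step for step in step2_pipeline}
    merged = list(step3_pipeline)
    # Insert in ascending index order so each step lands at its stated
    # position in the final list.
    for index, fun in sorted(zip(required_indexes, required_funs)):
        merged.insert(index, dict(by_target[fun]))
    return merged


# Hypothetical step 2 and step 3 pipelines (names and values are made up).
step2 = [
    {"target": "SetConfig", "params": {"label_channel": "cell_type"}},
    {"target": "FilterGenes", "params": {"min_cells": 3}},
]
step3 = [{"target": "FilterGenes", "params": {"min_cells": 5}}]

result = insert_required_steps(step3, step2, required_indexes=[1], required_funs=["SetConfig"])
```

Here the step 3 pipeline keeps its own `FilterGenes` settings, while `SetConfig` is copied from step 2.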

xingzhongyu · 2024-02-27T01:48:49Z

dance/pipeline.py

+        count += 1
+
+
+def run_step3(MAINDIR, evaluate_pipeline, step2_pipeline_planer: PipelinePlaner, tune_mode="params", sweep_id=None):


The step3 function is run by default

dance/transforms/cell_feature.py

xingzhongyu · 2024-02-27T13:17:51Z

examples/tuning/cta_scdeepsort/main.py

+    parser = argparse.ArgumentParser()
+    parser.add_argument("--batch_size", type=int, default=500)
+    parser.add_argument("--cache", action="store_true", help="Cache processed data.")
+    # parser.add_argument("--dense_dim", type=int, default=400, help="number of hidden gcn units")


Some hyperparameters are used by both the preprocessing function and the model. Delete them from the command-line arguments and read them from the configuration file instead.
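One way to picture this, as a sketch only: a single configuration value feeds both the preprocessing step and the model, so the two cannot drift apart. The JSON layout and the functions `build_preprocessing` and `build_model` are hypothetical; only the `dense_dim` name and its former default of 400 come from the commented-out argument above.

```python
import json

# Hypothetical config content; in practice this would be read from a file.
CONFIG_TEXT = '{"dense_dim": 400}'


def build_preprocessing(config):
    # The preprocessing output size follows the shared value.
    return {"n_components": config["dense_dim"]}


def build_model(config):
    # The model input size reads the same shared value.
    return {"dense_dim": config["dense_dim"]}


config = json.loads(CONFIG_TEXT)
preprocessing = build_preprocessing(config)
model = build_model(config)
```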

xingzhongyu · 2024-02-28T13:37:19Z

examples/tuning/cta_actinn/main.py

+    parser.add_argument("--valid_dataset", nargs="+", default=[1970], help="List of valid dataset ids.")
+    parser.add_argument("--seed", type=int, default=0, help="Initial random seed, offset for each repetition")
+
+    parser.add_argument("--tune_mode", default="pipeline_params", choices=["pipeline", "params", "pipeline_params"])


Some parameters need to be added.

xingzhongyu · 2024-02-28T13:38:41Z

examples/tuning/cta_svm/main.py

    parser.add_argument("--log_level", type=str, default="INFO", choices=get_args(LogLevel))
    parser.add_argument("--species", default="mouse")
    parser.add_argument("--test_dataset", nargs="+", default=[2695], type=int, help="list of dataset id")
    parser.add_argument("--tissue", default="Brain")  # TODO: Add option for different tissue name for train/test
    parser.add_argument("--train_dataset", nargs="+", default=[753], type=int, help="list of dataset id")
    parser.add_argument("--valid_dataset", nargs="+", default=[3285], type=int, help="list of dataset id")
-    parser.add_argument("--tune_mode", default="pipeline", choices=["pipeline", "params"])
+    parser.add_argument("--tune_mode", default="pipeline_params", choices=["pipeline", "params", "pipeline_params"])


Add new tune_mode
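For illustration, the sketch below shows how the new `pipeline_params` choice behaves as the default and how it could select a config file following the `{tune_mode}_tuning_config.yaml` naming used elsewhere in this PR. The helpers `build_parser` and `config_path`, and the empty default for `--config_dir`, are assumptions for the example.

```python
import argparse
from pathlib import Path


def build_parser():
    parser = argparse.ArgumentParser()
    # The empty default for --config_dir is an assumption for this sketch.
    parser.add_argument("--config_dir", default="")
    parser.add_argument("--tune_mode", default="pipeline_params",
                        choices=["pipeline", "params", "pipeline_params"])
    return parser


def config_path(main_dir, args):
    # File name pattern: <config_dir><tune_mode>_tuning_config.yaml
    return Path(main_dir) / f"{args.config_dir}{args.tune_mode}_tuning_config.yaml"


default_args = build_parser().parse_args([])
params_args = build_parser().parse_args(["--tune_mode", "params"])
```

Any value outside the three choices is rejected by `argparse` before the script runs.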

xingzhongyu · 2024-03-02T12:06:43Z

dance/datasets/singlemodality.py

+            self.train2valid()
+
+    def train2valid(self):
+        logger.info("Copy train_dataset and use it as valid_dataset")


train_dataset can be used as valid_dataset when needed
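A minimal sketch of this fallback, assuming it is triggered when no validation dataset is given. `DemoDataset` is a hypothetical stand-in and not the class in `dance/datasets/singlemodality.py`; only the method name `train2valid` and the log message come from the diff.

```python
import copy
import logging

logger = logging.getLogger(__name__)


class DemoDataset:
    """Hypothetical stand-in for the dataset class."""

    def __init__(self, train_dataset, valid_dataset=None):
        self.train_dataset = list(train_dataset)
        self.valid_dataset = list(valid_dataset) if valid_dataset else None
        # Assumed trigger: fall back only when no validation ids are given.
        if self.valid_dataset is None:
            self.train2valid()

    def train2valid(self):
        logger.info("Copy train_dataset and use it as valid_dataset")
        # Deep copy so later changes to one list do not affect the other.
        self.valid_dataset = copy.deepcopy(self.train_dataset)


fallback = DemoDataset(train_dataset=[753])
explicit = DemoDataset(train_dataset=[753], valid_dataset=[3285])
```

Validation scores computed on a copy of the training data tend to be optimistic, so this fallback is probably best reserved for cases where no separate validation set exists.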

dance/pipeline.py

xingzhongyu · 2024-03-02T12:12:40Z

dance/transforms/normalize.py

+            n_cells: Optional[int] = None,
+            bin_size: int = 500,
+            bw_adjust: float = 3,
+            processes_num=os.cpu_count(),


The number of processes needs to be parameterized
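One thing to keep in mind: a default such as `processes_num=os.cpu_count()` is evaluated once, when the function is defined, and `os.cpu_count()` may return `None`. The sketch below shows a common alternative that resolves the value at call time. The helper `resolve_processes_num` is hypothetical and not part of this PR.

```python
import os
from typing import Optional


def resolve_processes_num(processes_num: Optional[int] = None) -> int:
    """Return a usable worker count, defaulting to the number of CPUs."""
    if processes_num is None:
        # os.cpu_count() can return None, so fall back to a single process.
        processes_num = os.cpu_count() or 1
    if processes_num < 1:
        raise ValueError("processes_num must be at least 1")
    return processes_num


explicit_count = resolve_processes_num(4)
default_count = resolve_processes_num()
```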

xingzhongyu · 2024-03-03T13:35:10Z

examples/tuning/cta_svm/main.py

    args = parser.parse_args()
    logger.setLevel(args.log_level)
    logger.info(f"\n{pprint.pformat(vars(args))}")
-    MAINDIR = Path(__file__).resolve().parent
-    pipeline_planer = PipelinePlaner.from_config_file(f"{MAINDIR}/{args.config_dir}{args.tune_mode}_tuning_config.yaml")
+    file_root_path = Path(


Currently, the dataset name is used as the folder name; the name of the biological tissue can also be used.
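As a rough sketch of the two options, the helper below builds the folder path from either the dataset ids or the tissue name. The function `tuning_root`, its `use_tissue` flag, and the way the dataset name is derived from the ids are all assumptions; the truncated `file_root_path = Path(` line above does not show the actual layout.

```python
from pathlib import Path


def tuning_root(main_dir, train_dataset, tissue=None, use_tissue=False):
    """Build the folder that holds tuning configs (hypothetical layout)."""
    if use_tissue and tissue:
        folder = tissue
    else:
        # Assumed naming: join the dataset ids with underscores.
        folder = "_".join(str(i) for i in train_dataset)
    return Path(main_dir) / folder


by_dataset = tuning_root("/tmp/cta_svm", [753])
by_tissue = tuning_root("/tmp/cta_svm", [753], tissue="Brain", use_tissue=True)
```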

xingzhongyu · 2024-03-03T13:36:24Z

dance/pipeline.py

+
+
+def run_step3(root_path, evaluate_pipeline, step2_pipeline_planer: PipelinePlaner, tune_mode="params"):
+    """Run step 3 by default.


Run step 3 by default.


RemyLau

LGTM

RemyLau · 2024-06-13T16:24:39Z

dance/transforms/misc.py

@@ -117,6 +117,27 @@ def __call__(self, data):
        data.data.raw = data.data.copy()


+@register_preprocessor("misc")
+class UpdateRaw(BaseTransform):
+    """Update raw data.


UpdateRaw is confusing. Change to something like AlignRaw to more accurately reflect its function.


xingzhongyu and others added 16 commits March 20, 2024 09:59

minor change

52608da

minor

5eb19ef

update valid

d276eee

[pre-commit.ci] auto fixes from pre-commit.com hooks

802d4ce

for more information, see https://pre-commit.ci

remove unused imports

13d200a

implement skippable pipeline option; with test

006036e

minor

95a64b4

update config

9d435ed

[pre-commit.ci] auto fixes from pre-commit.com hooks

0ced9f2

for more information, see https://pre-commit.ci

update scdeepsort

40532a1

add examples

4974e32

[pre-commit.ci] auto fixes from pre-commit.com hooks

a861a99

for more information, see https://pre-commit.ci

add examples

b2fa438

add examples

cceb939

[pre-commit.ci] auto fixes from pre-commit.com hooks

1525b32

for more information, see https://pre-commit.ci

add example

2208e75

xingzhongyu and others added 28 commits May 26, 2024 11:55

update main

6fdea0c

Merge remote-tracking branch 'origin/celltype_annotation_automl' into…

32e6ae4

… celltype_annotation_automl

[pre-commit.ci] auto fixes from pre-commit.com hooks

c92563f

for more information, see https://pre-commit.ci

add mape

ae2963c

update main

d2866e7

[pre-commit.ci] auto fixes from pre-commit.com hooks

fd468d6

for more information, see https://pre-commit.ci

update filter

372d028

update filter

d0a9325

update main

5aa074a

[pre-commit.ci] auto fixes from pre-commit.com hooks

538e760

for more information, see https://pre-commit.ci

update main

6e5183d

[pre-commit.ci] auto fixes from pre-commit.com hooks

d3299eb

for more information, see https://pre-commit.ci

update main

1493433

Merge remote-tracking branch 'origin/celltype_annotation_automl' into…

087b8c2

… celltype_annotation_automl

[pre-commit.ci] auto fixes from pre-commit.com hooks

704d2b1

for more information, see https://pre-commit.ci

update filter

a5866f9

update step3 k

671ffa2

[pre-commit.ci] auto fixes from pre-commit.com hooks

54de3dd

for more information, see https://pre-commit.ci

Merge remote-tracking branch 'origin/celltype_annotation_automl' into…

05ccfc7

… celltype_annotation_automl

update pipeline

b2ba428

update pipeline

939a268

update pipeline

8255f8f

[pre-commit.ci] auto fixes from pre-commit.com hooks

3b7c910

for more information, see https://pre-commit.ci

update filter

3e2969f

update filter

73015c7

[pre-commit.ci] auto fixes from pre-commit.com hooks

8dd9235

for more information, see https://pre-commit.ci

update filter

76825d7

update parameter

d4f410f

RemyLau approved these changes Jun 14, 2024


JiayuanDing100 merged commit 3620ce1 into main Jun 17, 2024
7 checks passed
