This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

refactor: user experience improvement after fine-tuning #522

Merged
merged 29 commits into main from refactor-stubs-commons
Sep 8, 2022

Conversation

bwanglzu
Member

@bwanglzu bwanglzu commented Sep 1, 2022

Closes #511 #514

  1. Take the callback stubs out; depend on finetuner-stubs instead. DONE
  2. Take the model stubs out; depend on finetuner-stubs instead. DONE
  3. Add finetuner-commons as a soft dependency: pip install "finetuner[full]". DONE
  4. Provide extra functionality based on finetuner-commons, including get_model and preprocess_and_collate. DONE:
# after fine-tuning
artifact = run.save_artifact('my-model')
# artifacts include model and preprocess/collate function
# re-build the model 
model = finetuner.get_model(artifact=artifact)
# preprocess, collate and encode
finetuner.encode(model=model, data=da)

  • This PR references an open issue
  • I have added a line about this change to CHANGELOG

@bwanglzu bwanglzu linked an issue Sep 1, 2022 that may be closed by this pull request
@bwanglzu bwanglzu changed the title refactor: take out callbacks refactor: user experiment improvement after fine-tuning Sep 1, 2022
@bwanglzu bwanglzu changed the title refactor: user experiment improvement after fine-tuning refactor: user experience improvement after fine-tuning Sep 1, 2022
@github-actions github-actions bot added the area/testing This issue/PR affects testing label Sep 6, 2022
@github-actions github-actions bot removed the area/testing This issue/PR affects testing label Sep 6, 2022
@bwanglzu
Member Author

bwanglzu commented Sep 6, 2022

related PR in core: https://github.com/jina-ai/finetuner-core/pull/307

@guenthermi
Member

For the CLIP models, the stubs used here before are quite different from the stubs in finetuner-core (and finetuner-commons): there was only one stub instead of two for clip-text and clip-vision. This should be taken care of when the code is refactored, e.g., in the get_row method:

def get_row(model_stub) -> Tuple[str, ...]:
    """Get table row."""
    return (
        model_stub.name,
        model_stub.task,
        str(model_stub.output_shape[1]),
        model_stub.architecture,
        model_stub.description,
    )
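For illustration, the method above could be exercised with a minimal stand-in stub; the `_FakeStub` dataclass and its field values below are hypothetical, but they show why the two-stub CLIP layout yields one table row per tower:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class _FakeStub:
    """Hypothetical stand-in for a model stub from finetuner-stubs."""
    name: str
    task: str
    output_shape: Tuple[int, int]
    architecture: str
    description: str


def get_row(model_stub) -> Tuple[str, ...]:
    """Get table row."""
    return (
        model_stub.name,
        model_stub.task,
        str(model_stub.output_shape[1]),
        model_stub.architecture,
        model_stub.description,
    )


# CLIP now ships as two stubs (one per tower), so the listing has two rows:
stubs = [
    _FakeStub('clip-text', 'text-to-image', (1, 512), 'transformer', 'CLIP text tower'),
    _FakeStub('clip-vision', 'text-to-image', (1, 512), 'transformer', 'CLIP vision tower'),
]
rows = [get_row(s) for s in stubs]
```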

@bwanglzu
Member Author

bwanglzu commented Sep 7, 2022

@github-actions github-actions bot added the area/testing This issue/PR affects testing label Sep 7, 2022
@bwanglzu bwanglzu marked this pull request as ready for review September 8, 2022 06:47
@github-actions github-actions bot added size/l and removed size/m labels Sep 8, 2022
@@ -6,6 +6,6 @@ version = 0.5.2
# E203 is whitespace before ':' - which occurs in numpy slicing, e.g. in
# dists[2 * i : 2 * i + 2, :]
# W503 is line break before binary operator - happens when black splits up lines
ignore = E203, W503
ignore = E203, W503, F405, F403
Member


what do these ignore?

Member Author


they silence the warnings triggered by `import *` from stubs and commons (F403: star import used; F405: name may be undefined, or defined from star imports)

setup.py Outdated
install_requires=_main_deps,
install_requires=[
'docarray[common]>=0.13.31',
'finetuner-stubs==0.0.1b1',
Member


we don't have an official version yet?

Member Author


until we release and trigger core CD

@@ -0,0 +1 @@
from stubs.callback import * # noqa F401
Member


why do you create a directory here? why not a callback.py module?

@@ -0,0 +1,17 @@
from stubs.model import * # noqa F401
Member


same here, why not a model.py module?

@@ -74,12 +75,22 @@ def logs(self) -> str:
experiment_name=self._experiment_name, run_name=self._name
)

def stream_logs(self) -> Iterator[str]:
def stream_logs(self, interval: int = 5) -> Iterator[str]:
Member


let's document the interval argument here
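One hedged sketch of how the new `interval` argument might be documented and used; the polling-loop shape and the `poll`/`is_finished` callables below are assumptions standing in for the client's internal API, not the PR's actual code:

```python
import time
from typing import Callable, Iterator, List


def stream_logs(
    poll: Callable[[], List[str]],
    is_finished: Callable[[], bool],
    interval: float = 5,
) -> Iterator[str]:
    """Yield log lines as they arrive until the run finishes.

    :param poll: returns the log lines that arrived since the last call
        (stand-in for the client's internal fetch).
    :param is_finished: returns True once the run has completed.
    :param interval: number of seconds to sleep between consecutive
        polls of the server for new log lines.
    :yield: newly arrived log lines.
    """
    while True:
        for line in poll():
            yield line
        if is_finished():
            return
        time.sleep(interval)
```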

)
}
def _list_models() -> Dict[str, model.ModelStubType]:
from stubs import model as model_stub
Member


why not import at the top? I think importing inside a function like this can become a bad habit

@@ -161,7 +172,7 @@ def fit(
:param learning_rate: learning rate for the optimizer.
:param epochs: Number of epochs for fine-tuning.
:param batch_size: Number of items to include in a batch.
:param callbacks: List of callback stub objects. See the `finetuner.callback`
:param callbacks: List of callback stub objects.`
Member


there is a stray ` char at the end

select_model: Optional[str] = None,
gpu: bool = False,
logging_level: str = 'WARNING',
):
Member


add type hint

@@ -296,3 +307,82 @@ def get_token() -> str:
:return: user token as string object.
"""
return ft.get_token()


def get_model(
Member


I don't like the get_model name; maybe build_engine?

Member Author

@bwanglzu bwanglzu Sep 8, 2022


no, think about when we support PyTorch; it won't be an ORT engine anymore

:param select_model: Finetuner run artifacts might contain multiple models. In
such cases you can select which model to deploy using this argument. For CLIP
fine-tuning, you can choose either `clip-vision` or `clip-text`.
:param gpu: if specified to True, use cuda device for inference.
Member


logging level is missing from the docstring
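A hedged sketch of the full signature with the missing parameter documented; the parameter names and defaults match the diff shown above, but the docstring wording for logging_level is an assumption and the body is elided:

```python
from typing import Optional


def get_model(
    artifact: str,
    select_model: Optional[str] = None,
    gpu: bool = False,
    logging_level: str = 'WARNING',
):
    """Re-build the model based on the model inference session with ONNX.

    :param artifact: Specify a finetuner run artifact. Can be a path to a
        local directory, a path to a local zip file, or a Hubble artifact ID.
    :param select_model: Finetuner run artifacts might contain multiple
        models; for CLIP fine-tuning, choose `clip-vision` or `clip-text`.
    :param gpu: if set to True, use the cuda device for inference.
    :param logging_level: logging level for the inference session,
        e.g. 'WARNING' or 'INFO' (wording assumed for illustration).
    """
    ...  # body elided
```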

Member

@guenthermi guenthermi left a comment


Added some minor comments

"""Re-build the model based on the model inference session with ONNX.

:param artifact: Specify a finetuner run artifact. Can be a path to a local
directory, a path to a local zip file or a Hubble artifact ID. Individual
Member


Suggested change
directory, a path to a local zip file or a Hubble artifact ID. Individual
directory, a path to a local zip file, or a Hubble artifact ID. Individual

data: DocumentArray,
batch_size: int = 32,
):
"""Process, collate and encode the `DocumentArray` with embeddings.
Member


I think it is "[Pre-]process"

..Note::
please install finetuner[full] to include all the dependencies.
"""
for i, batch in enumerate(data.batch(batch_size)):
Member


where do you need the `i`?
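The reviewer's point can be sketched in isolation: when the batch index is unused, plain iteration (or slicing, as below) suffices and `enumerate` adds nothing. The `embed` callable here is a hypothetical stand-in for the model's encoding step:

```python
from typing import Callable, List


def encode(
    embed: Callable[[List[str]], List[List[float]]],
    data: List[str],
    batch_size: int = 32,
) -> List[List[float]]:
    """Embed data batch by batch; the batch index itself is never
    needed, so no enumerate() is required."""
    embeddings: List[List[float]] = []
    for start in range(0, len(data), batch_size):
        embeddings.extend(embed(data[start:start + batch_size]))
    return embeddings
```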

Member

@guenthermi guenthermi left a comment


LGTM

@bwanglzu bwanglzu merged commit 3e84b18 into main Sep 8, 2022
@bwanglzu bwanglzu deleted the refactor-stubs-commons branch September 8, 2022 10:00