
CPU and M1/M2 GPU platform support #71

Closed
Conversation

xiezhq-hermann
Collaborator

A minimal modification to extend FlexGen to the CPU and Apple M1/M2 GPU (MPS) platforms.
Not yet fully tested with the various offloading settings.
@Ying1123 @merrymercy

@HIRANO-Satoshi

I tried it.

0af9051 *   main Merge branch 'xiezhq-hermann/main'
        |\  
18482fa | * xiezhq-hermann/main update CPU and m1/m2
980ca74 | *   merge latest main
        | |\  
332849a | * | enable CPU and M1/M2 platform
fea8321 * | | origin/main update version
896e1e0 * | | Update README.md
50ae8ad * | | Delete README.md
9d888e5 * | Move apps into flexgen package (#70)

Something seems wrong.

ppa-hirano:FlexGen hirano-s$ python3 -m flexgen.flex_opt --model facebook/opt-1.3b
model size: 2.443 GB, cache size: 0.398 GB, hidden size (prefill): 0.008 GB
init weight...
Exception in thread Thread-1 (copy_worker_func):
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/hirano-s/dev/FlexGen/flexgen/pytorch_backend.py", line 917, in copy_worker_func
    torch.cuda.set_device(device_id)
  File "/opt/homebrew/lib/python3.10/site-packages/torch/cuda/__init__.py", line 326, in set_device
    torch._C._cuda_setDevice(device)
AttributeError: module 'torch._C' has no attribute '_cuda_setDevice'
[identical tracebacks in Thread-2, Thread-3, and Thread-4 omitted]
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/hirano-s/dev/FlexGen/flexgen/flex_opt.py", line 1334, in <module>
    run_flexgen(args)
  File "/Users/hirano-s/dev/FlexGen/flexgen/flex_opt.py", line 1218, in run_flexgen
    model = OptLM(opt_config, env, args.path, policy)
  File "/Users/hirano-s/dev/FlexGen/flexgen/flex_opt.py", line 617, in __init__
    self.load_weight_stream = torch.cuda.Stream()
  File "/opt/homebrew/lib/python3.10/site-packages/torch/cuda/streams.py", line 34, in __new__
    return super(Stream, cls).__new__(cls, priority=priority, **kwargs)
TypeError: object.__new__() takes exactly one argument (the type to instantiate)
Exception ignored in: <function OptLM.__del__ at 0x11b250040>
Traceback (most recent call last):
  File "/Users/hirano-s/dev/FlexGen/flexgen/flex_opt.py", line 1148, in __del__
    self.delete_all_weights()
  File "/Users/hirano-s/dev/FlexGen/flexgen/flex_opt.py", line 803, in delete_all_weights
    self.delete_weight(j, 0)
  File "/Users/hirano-s/dev/FlexGen/flexgen/flex_opt.py", line 669, in delete_weight
    for x in self.weight_home[j].pop():
AttributeError: 'OptLM' object has no attribute 'weight_home'
ppa-hirano:FlexGen hirano-s$ 
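A sketch of the failure mode in the traceback above: on a CPU-only PyTorch build, `torch.cuda.set_device` raises `AttributeError` because the compiled CUDA bindings are absent. Guarding on `torch.cuda.is_available()` (or skipping torch entirely when it is not installed) avoids the crash; `set_device_safely` is a hypothetical helper, not FlexGen API.

```python
def set_device_safely(device_id: int) -> None:
    # Only bind a CUDA device when the CUDA backend actually exists.
    try:
        import torch
    except ImportError:
        return  # no PyTorch at all: nothing to configure
    if torch.cuda.is_available():
        torch.cuda.set_device(device_id)
    # on CPU-only or MPS builds, fall through without touching torch.cuda

set_device_safely(0)  # no-op on machines without CUDA
```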

@xiezhq-hermann
Collaborator Author

@HIRANO-Satoshi Did you run the code on your Mac? If so, you should add --platform "mps:0" to the command. If you tested it on a machine with an NVIDIA GPU, can you try the latest commit (I merged it for you) and rebuild FlexGen? I am not sure which code you just ran.


@HIRANO-Satoshi

I don't have an NVIDIA GPU. With --platform cpu, it started working. Thanks very much!

Maybe apps/completion.py needs the --platform option.

ppa-hirano:FlexGen hirano-s$ python3 -m flexgen.apps.completion --model facebook/opt-1.3b
...
 File "/opt/homebrew/lib/python3.10/site-packages/torch/cuda/__init__.py", line 326, in set_device
    torch._C._cuda_setDevice(device)
AttributeError: module 'torch._C' has no attribute '_cuda_setDevice'

ppa-hirano:FlexGen hirano-s$ python3 -m flexgen.apps.completion --model facebook/opt-1.3b --platform cpu
usage: completion.py [-h] [--model MODEL] [--path PATH]
[--offload-dir OFFLOAD_DIR]
[--percent PERCENT [PERCENT ...]]
[--pin-weight [PIN_WEIGHT]] [--compress-weight]
[--compress-cache]
completion.py: error: unrecognized arguments: --platform cpu
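A hypothetical sketch of giving apps/completion.py the same --platform option that flex_opt accepts in this PR. Only --model and --platform are shown; the real completion.py has more options, and the default value here is an assumption.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Mirror the flex_opt interface so the two entry points stay consistent.
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", type=str, default="facebook/opt-1.3b")
    parser.add_argument(
        "--platform", type=str, default="cuda:0",
        help='compute device, e.g. "cuda:0", "mps:0", or "cpu"')
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.platform)
```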

@HIRANO-Satoshi
Copy link

Choosing a proper default without an explicit option would be better.

I'm curious how fast the Apple Neural Engine is.
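A minimal sketch of the "proper default" suggestion above: pick the best available backend when --platform is not given. The function name is hypothetical; the availability checks are standard PyTorch API.

```python
def default_platform() -> str:
    # Prefer CUDA, then Apple-silicon MPS, then plain CPU.
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch installed: only the CPU path makes sense
    if torch.cuda.is_available():
        return "cuda:0"
    mps = getattr(torch.backends, "mps", None)  # absent on older torch versions
    if mps is not None and mps.is_available():
        return "mps:0"
    return "cpu"
```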
