Update platform references #3304

Merged: 3 commits, May 20, 2024
17 changes: 8 additions & 9 deletions README.md
@@ -17,7 +17,7 @@
<a href="https://www.mosaicml.com">[Website]</a>
- <a href="https://docs.mosaicml.com/projects/composer/en/stable/getting_started/installation.html">[Getting Started]</a>
- <a href="https://docs.mosaicml.com/projects/composer/">[Docs]</a>
- <a href="https://www.mosaicml.com/careers">[We're Hiring!]</a>
- <a href="https://www.databricks.com/company/careers/open-positions?department=Mosaic%20AI&location=all">[We're Hiring!]</a>
</p></h4>

<p align="center">
@@ -236,18 +236,17 @@ Here are some resources actively maintained by the Composer community to help yo
</tbody>
</table>

# 🛠️ For Best Results, Use with the MosaicML Ecosystem
# 🛠️ For Best Results, Use within the Databricks & MosaicML Ecosystem

Composer can be used on its own, but for the smoothest experience we recommend using it in combination with other components of the MosaicML ecosystem:

![We recommend that you train models with Composer, MosaicML StreamingDatasets, and the MosaicML platform.](docs/source/_static/images/ecosystem.png)
![We recommend that you train models with Composer, MosaicML StreamingDatasets, and Mosaic AI training.](docs/source/_static/images/ecosystem.png)

- [**MosaicML platform**](https://www.mosaicml.com/training) (MCLI)- Our proprietary Command Line Interface (CLI) and Python SDK for orchestrating, scaling, and monitoring the GPU nodes and container images executing training and deployment. Used by our customers for training their own Generative AI models.
- **To get started, [sign up here](https://www.mosaicml.com/get-started?utm_source=blog&utm_medium=referral&utm_campaign=llama2) to apply for access and check out our [Training](https://www.mosaicml.com/training) and [Inference](https://www.mosaicml.com/inference) product pages**
- [**Mosaic AI training**](https://www.databricks.com/product/machine-learning/mosaic-ai-training) (MCLI)- Our proprietary Command Line Interface (CLI) and Python SDK for orchestrating, scaling, and monitoring the GPU nodes and container images executing training and deployment. Used by our customers for training their own Generative AI models.
- **To get started, [reach out here](https://www.databricks.com/company/contact) and check out our [Training](https://www.databricks.com/product/machine-learning/mosaic-ai-training) product pages**
- [**MosaicML LLM Foundry**](https://github.com/mosaicml/llm-foundry) - This open source repository contains code for training, finetuning, evaluating, and preparing LLMs for inference with [Composer](https://github.com/mosaicml/composer). Designed to be easy to use, efficient and flexible, this codebase is designed to enable rapid experimentation with the latest techniques.
- [**MosaicML StreamingDataset**](https://github.com/mosaicml/streaming) - Open-source library for fast, accurate streaming from cloud storage.
- [**MosaicML Diffusion**](https://github.com/mosaicml/diffusion) - Open-source code to train your own Stable Diffusion model on your own data. Learn more via our blogs: ([Results](https://www.mosaicml.com/blog/stable-diffusion-2) , [Speedup Details](https://www.mosaicml.com/blog/diffusion))
- [**MosaicML Examples**](https://github.com/mosaicml/examples) - This repo contains reference examples for using the [MosaicML platform](https://www.notion.so/Composer-README-Draft-5d30690d40f04cdf8528f749e98782bf?pvs=21) to train and deploy machine learning models at scale. It's designed to be easily forked/copied and modified.

# **🏆 Project Showcase**

@@ -258,7 +257,7 @@ Here are some projects and experiments that used Composer. Got something to add?
- [MPT-7B-8k Blog](https://www.mosaicml.com/blog/long-context-mpt-7b-8k)
- [MPT-30B Blog](https://www.mosaicml.com/blog/mpt-30b)
- [**Mosaic Diffusion Models**](https://www.mosaicml.com/blog/training-stable-diffusion-from-scratch-costs-160k): see how we trained a stable diffusion model from scratch for <$50k
- [**replit-code-v1-3b**](https://huggingface.co/replit/replit-code-v1-3b): A 2.7B Causal Language Model focused on **Code Completion,** trained by Replit on the MosaicML platform in 10 days.
- [**replit-code-v1-3b**](https://huggingface.co/replit/replit-code-v1-3b): A 2.7B Causal Language Model focused on **Code Completion,** trained by Replit on Mosaic AI training in 10 days.
- **BabyLLM:** the first LLM to support both Arabic and English. This 7B model was trained by MetaDialog on the world’s largest Arabic/English dataset to improve customer support workflows ([Blog](https://blogs.nvidia.com/blog/2023/08/31/generative-ai-startups-africa-middle-east/))
- [**BioMedLM**](https://www.mosaicml.com/blog/introducing-pubmed-gpt): a domain-specific LLM for Bio Medicine built by MosaicML and [Stanford CRFM](https://crfm.stanford.edu/)

Expand All @@ -268,15 +267,15 @@ Composer is part of the broader Machine Learning community, and we welcome any c

To start contributing, see our [Contributing](https://github.com/mosaicml/composer/blob/dev/CONTRIBUTING.md) page.

P.S.: [We're hiring](https://www.mosaicml.com/careers)!
P.S.: [We're hiring](https://www.databricks.com/company/careers/open-positions?department=Mosaic%20AI&location=all)!

# ❓FAQ

- **What is the best tech stack you recommend when training large models?**
- We recommend that users combine components of the MosaicML ecosystem for the smoothest experience:
- Composer
- [StreamingDataset](https://github.com/mosaicml/streaming)
- [MCLI](https://www.mosaicml.com/training) (MosaicML platform)
- [MCLI](https://www.databricks.com/product/machine-learning/mosaic-ai-training) (Databricks Mosaic AI Training)
- **How can I get community support for using Composer?**
- You can join our [Community Slack](https://mosaicml.me/slack)!
- **How does Composer compare to other trainers like NeMo Megatron and PyTorch Lightning?**
4 changes: 2 additions & 2 deletions composer/cli/launcher.py
@@ -549,11 +549,11 @@ def main():
if os.environ.get(MOSAICML_PLATFORM_ENV_VAR, 'false').lower() == 'true' and str(
os.environ.get(MOSAICML_LOG_DIR_ENV_VAR, 'false'),
).lower() != 'false' and os.environ.get(MOSAICML_GPU_LOG_FILE_PREFIX_ENV_VAR, 'false').lower() != 'false':
log.info('Logging all GPU ranks to Mosaic Platform.')
log.info('Logging all GPU ranks to Mosaic AI Training.')
log_file_format = f'{os.environ.get(MOSAICML_LOG_DIR_ENV_VAR)}/{os.environ.get(MOSAICML_GPU_LOG_FILE_PREFIX_ENV_VAR)}{{local_rank}}.txt'
if args.stderr is not None or args.stdout is not None:
log.info(
'Logging to Mosaic Platform. Ignoring provided stdout and stderr args. To use provided stdout and stderr, set MOSAICML_LOG_DIR=false.',
'Logging to Mosaic AI Training. Ignoring provided stdout and stderr args. To use provided stdout and stderr, set MOSAICML_LOG_DIR=false.',
)
args.stdout = log_file_format
args.stderr = None
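The launcher hunk above gates per-GPU log files on three environment variables. That gating can be sketched as a pure function (illustrative only — the variable names come from the diff, but `resolve_log_file_format` is a hypothetical helper, not part of Composer):

```python
from typing import Optional


def resolve_log_file_format(env: dict) -> Optional[str]:
    """Return the per-rank log file pattern, or None when not on the platform.

    Mirrors the diff's condition: the platform flag must be 'true', and both
    the log-dir and file-prefix variables must hold real values (not 'false').
    """
    on_platform = env.get('MOSAICML_PLATFORM', 'false').lower() == 'true'
    log_dir = env.get('MOSAICML_LOG_DIR', 'false')
    prefix = env.get('MOSAICML_GPU_LOG_FILE_PREFIX', 'false')
    if on_platform and log_dir.lower() != 'false' and prefix.lower() != 'false':
        # {local_rank} is left as a placeholder, filled in once per GPU process
        return f'{log_dir}/{prefix}{{local_rank}}.txt'
    return None
```

With all three variables set, e.g. `MOSAICML_LOG_DIR=/logs` and `MOSAICML_GPU_LOG_FILE_PREFIX=gpu-`, the helper yields `/logs/gpu-{local_rank}.txt`; with any of them missing or `'false'`, it yields `None` and stdout/stderr behave as usual.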
8 changes: 4 additions & 4 deletions composer/loggers/mosaicml_logger.py
@@ -1,7 +1,7 @@
# Copyright 2022 MosaicML Composer authors
# SPDX-License-Identifier: Apache-2.0

"""Log to the MosaicML platform."""
"""Log to Mosaic AI Training."""

from __future__ import annotations

@@ -42,12 +42,12 @@


class MosaicMLLogger(LoggerDestination):
"""Log to the MosaicML platform.
"""Log to Mosaic AI Training.

Logs metrics to the MosaicML platform. Logging only happens on rank 0 every ``log_interval``
Logs metrics to Mosaic AI Training. Logging only happens on rank 0 every ``log_interval``
seconds to avoid performance issues.

When running on the MosaicML platform, the logger is automatically enabled by Trainer. To disable,
When running on Mosaic AI Training, the logger is automatically enabled by Trainer. To disable,
the environment variable 'MOSAICML_PLATFORM' can be set to False.

Args:
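The docstring above says logging happens only on rank 0 and at most once every `log_interval` seconds. A minimal sketch of that gate (this `IntervalGate` class is illustrative, not Composer's implementation):

```python
class IntervalGate:
    """Decide whether a process should emit a log entry right now.

    Only rank 0 ever logs, and successive logs must be at least
    `log_interval` seconds apart, matching the behavior the docstring
    describes for MosaicMLLogger.
    """

    def __init__(self, rank: int, log_interval: float = 60.0):
        self.rank = rank
        self.log_interval = log_interval
        self._last = float('-inf')  # no log emitted yet

    def should_log(self, now: float) -> bool:
        if self.rank != 0:
            return False  # non-zero ranks stay silent
        if now - self._last >= self.log_interval:
            self._last = now
            return True
        return False
```

Throttling to rank 0 on an interval keeps metric uploads from competing with the training loop, which is the performance concern the docstring alludes to.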
2 changes: 1 addition & 1 deletion docs/source/index.rst
@@ -144,6 +144,6 @@ Composer is part of the broader Machine Learning community, and we welcome any c
api_reference/*


.. _Twitter: https://twitter.com/mosaicml
.. _Twitter: https://twitter.com/DbrxMosaicAI
.. _Email: mailto:community@mosaicml.com
.. _Slack: https://mosaicml.me/slack
2 changes: 1 addition & 1 deletion examples/checkpoint_autoresume.ipynb
@@ -10,7 +10,7 @@
"\n",
"We've put together this tutorial to demonstrate this feature in action and how you can activate it through the Composer trainer.\n",
"\n",
"**🐕 Autoresume via Watchdog**: Composer autoresumption works best when coupled with automated node failure detection and retries on the MosaicML platform. \n",
"**🐕 Autoresume via Watchdog**: Composer autoresumption works best when coupled with automated node failure detection and retries on Mosaic AI training. \n",
"See our [platform docs page](https://docs.mosaicml.com/projects/mcli/en/latest/training/watchdog.html) on enabling this feature for your runs\n",
"\n",
"### Recommended Background\n",
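Conceptually, the autoresume feature the notebook describes boils down to checking the save folder for a latest checkpoint at startup: resume from it if present, otherwise start fresh. A toy sketch under that assumption (the helper and the `latest-rank0.pt` file name are illustrative, not Composer's actual logic):

```python
import os
import tempfile
from typing import Optional


def find_resume_checkpoint(save_folder: str,
                           latest_name: str = 'latest-rank0.pt') -> Optional[str]:
    """Return the path to a resumable checkpoint, or None to start fresh."""
    candidate = os.path.join(save_folder, latest_name)
    return candidate if os.path.exists(candidate) else None


# Demo: a fresh folder has nothing to resume from; once a checkpoint
# exists (e.g. written before a node failure), the next launch finds it.
demo_dir = tempfile.mkdtemp()
fresh = find_resume_checkpoint(demo_dir)                   # first run: nothing yet
open(os.path.join(demo_dir, 'latest-rank0.pt'), 'w').close()
resumed = find_resume_checkpoint(demo_dir)                 # relaunch: checkpoint found
```

The value of the watchdog pairing mentioned above is that the platform restarts the failed run automatically, so this check fires without any manual intervention.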