Easily Train Models with H100 GPUs on NVIDIA DGX Cloud

huggingface.co

Updated on March 18, 2024


How it works

Train on DGX Cloud lets users train Hugging Face models on NVIDIA DGX Cloud. To use it, users need access to an Organization with a Hugging Face Enterprise subscription. The service supports several model architectures: Llama, Falcon, Mistral, Mixtral, T5, Gemma, Stable Diffusion, and Stable Diffusion XL.

Users create a training job by configuring the hardware, base model, task, and training parameters, and by uploading a training dataset. Hardware options include NVIDIA H100 GPUs and L40S GPUs. Once the job is set up, users start training and monitor its progress through the logs. When training completes, the fine-tuned model is uploaded to a private repository within the selected namespace on the Hugging Face Hub.

Train on DGX Cloud is currently available for Enterprise Hub Organizations. It is billed per minute of GPU instance usage during training jobs, and pricing is listed separately for each GPU model.


FAQ

Q: What is Train on DGX Cloud?

A: Train on DGX Cloud is a service that allows users with access to an Organization with a Hugging Face Enterprise subscription to train various model architectures on NVIDIA DGX Cloud.

Q: What are some of the model architectures supported by Train on DGX Cloud?

A: Train on DGX Cloud supports model architectures such as Llama, Falcon, Mistral, Mixtral, T5, Gemma, Stable Diffusion, and Stable Diffusion XL.

Q: How can users create a training job on Train on DGX Cloud?

A: Users create a training job by configuring the hardware, base model, task, and training parameters. Hardware options include NVIDIA H100 GPUs and L40S GPUs.
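
To make the four configuration choices concrete, here is a minimal Python sketch of how a job configuration could be modeled and sanity-checked before submission. It is not the service's API: the class `TrainingJobConfig`, its field names, and the `validate` helper are all hypothetical, introduced only for illustration. Only the lists of architectures and GPU options come from the text above.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Values taken from the article text.
SUPPORTED_ARCHITECTURES = {
    "Llama", "Falcon", "Mistral", "Mixtral", "T5", "Gemma",
    "Stable Diffusion", "Stable Diffusion XL",
}
GPU_OPTIONS = {"H100", "L40S"}


@dataclass
class TrainingJobConfig:
    """Hypothetical container for the four choices a user makes.

    This class and its field names are assumptions for illustration;
    they do not mirror any real Train on DGX Cloud interface.
    """
    gpu_type: str        # hardware choice, e.g. "H100" or "L40S"
    architecture: str    # model family of the base model
    base_model: str      # Hub repository of the base model
    task: str            # training task (free-form here)
    training_params: dict = field(default_factory=dict)

    def validate(self) -> list[str]:
        """Return a list of human-readable problems (empty if none)."""
        problems = []
        if self.gpu_type not in GPU_OPTIONS:
            problems.append(f"unknown GPU type: {self.gpu_type!r}")
        if self.architecture not in SUPPORTED_ARCHITECTURES:
            problems.append(f"unsupported architecture: {self.architecture!r}")
        if not self.base_model:
            problems.append("base_model must not be empty")
        if not self.task:
            problems.append("task must not be empty")
        return problems


# Example usage with placeholder values (the repository name is made up).
config = TrainingJobConfig(
    gpu_type="H100",
    architecture="Mistral",
    base_model="my-org/my-base-model",
    task="text-generation",
    training_params={"epochs": 1},
)
print(config.validate())
```

Collecting every problem in a list, rather than raising on the first one, lets a user fix a whole configuration in one pass.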

Q: What is the process of monitoring a training job on Train on DGX Cloud?

A: Once a job has been set up and started, users can monitor its progress through the training logs.

Q: Where is the fine-tuned model uploaded upon completion of training on Train on DGX Cloud?

A: When training completes, the fine-tuned model is uploaded to a private repository within the selected namespace on the Hugging Face Hub.

Q: Who can currently access Train on DGX Cloud?

A: Train on DGX Cloud is currently available for Enterprise Hub Organizations. Users need access to an Organization with a Hugging Face Enterprise subscription.

Q: How is Train on DGX Cloud billed?

A: Train on DGX Cloud is billed per minute of GPU instance usage during training jobs. Pricing is listed separately for each GPU model.
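
As a rough illustration of per-minute billing, the sketch below estimates the cost of a job from its duration and an hourly instance rate. The function name `estimate_job_cost` is hypothetical, the rate used in the example is a made-up placeholder and not an actual price, and rounding partial minutes up is an assumption this page does not confirm.

```python
import math


def estimate_job_cost(duration_seconds: float, hourly_rate: float) -> float:
    """Estimate the cost of one training job billed per minute.

    Assumptions (not confirmed by the source):
      - partial minutes are rounded up to a full minute;
      - `hourly_rate` is the price of the whole GPU instance per hour.
    """
    if duration_seconds < 0 or hourly_rate < 0:
        raise ValueError("duration and rate must be non-negative")
    billed_minutes = math.ceil(duration_seconds / 60)
    # Convert the hourly rate to a per-minute charge.
    return round(billed_minutes * hourly_rate / 60, 2)


# Example: a 90-minute job at a placeholder rate of 6.00 per hour.
print(estimate_job_cost(90 * 60, 6.00))
```

Because billing covers only GPU instance usage during training jobs, the estimate depends only on how long the job runs and which instance it uses.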
