Can one use the CMLE configuration file to define how many machines will be used in distributed ML model training?

by Hema Gunasekaran / Tuesday, 14 November 2023 / Published in Artificial Intelligence, EITC/AI/GCML Google Cloud Machine Learning, Google Cloud AI Platform, Training models with custom containers on Cloud AI Platform

When using distributed machine learning (ML) model training on Google Cloud AI Platform, formerly Cloud Machine Learning Engine (CMLE), you can indeed use the job configuration file to define the number of machines used in training. Strictly speaking, this file configures the training job rather than a model deployment. The same file can also set the machine type for each role in the cluster, provided you select the CUSTOM scale tier.

In distributed ML model training, the configuration file (commonly named config.yaml) lets you specify the scale tier for training under its trainingInput section. The scale tier determines the number and type of machines used in the training job. The predefined tiers, such as BASIC, STANDARD_1, and PREMIUM_1, each come with a fixed number of workers and parameter servers, while the CUSTOM tier lets you set those values yourself. By selecting the appropriate scale tier, you control the number of machines used for training.

For example, the scale tier BASIC uses a single worker instance and no parameter servers, so it does not run distributed training at all. The scale tier STANDARD_1 uses one master, four workers, and three parameter servers. The scale tier PREMIUM_1 uses one master, 19 workers, and 11 parameter servers. The scale tier CUSTOM lets you specify the number of workers and parameter servers explicitly through the workerCount and parameterServerCount fields.
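
To make the link between scale tier and machine count concrete, here is a minimal Python sketch that computes the cluster size for a given trainingInput section. The replica counts for the predefined tiers are taken from the AI Platform Training documentation (one master plus the listed workers and parameter servers); the names PREDEFINED_TIERS and total_machines are hypothetical helpers introduced only for this illustration and are not part of any Google library.

```python
# Replica counts for the predefined scale tiers as documented for
# AI Platform Training. Each tier also includes one master replica.
PREDEFINED_TIERS = {
    "BASIC": {"workers": 0, "parameter_servers": 0},
    "STANDARD_1": {"workers": 4, "parameter_servers": 3},
    "PREMIUM_1": {"workers": 19, "parameter_servers": 11},
}


def total_machines(training_input):
    """Return master + workers + parameter servers for a trainingInput dict."""
    tier = training_input["scaleTier"]
    if tier == "CUSTOM":
        # With CUSTOM, the counts come straight from the configuration file.
        workers = int(training_input.get("workerCount", 0))
        parameter_servers = int(training_input.get("parameterServerCount", 0))
    else:
        counts = PREDEFINED_TIERS[tier]
        workers = counts["workers"]
        parameter_servers = counts["parameter_servers"]
    # One master replica is always present.
    return 1 + workers + parameter_servers


if __name__ == "__main__":
    for tier in PREDEFINED_TIERS:
        print(tier, total_machines({"scaleTier": tier}))
    custom = {"scaleTier": "CUSTOM", "workerCount": 2, "parameterServerCount": 1}
    print("CUSTOM", total_machines(custom))
```

The sketch covers only the three predefined tiers named in this answer plus CUSTOM; other tiers would need their own entries.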

Whether you can also choose the type of machines depends on the scale tier. With a predefined tier, the machine types are fixed by Google Cloud AI Platform and cannot be overridden in the configuration file. With the CUSTOM tier, you specify them yourself: masterType is required, and workerType and parameterServerType must be set whenever the corresponding count is greater than zero. The values are machine type names supported by AI Platform Training, for example Compute Engine types such as n1-standard-4.
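
As an illustration of how machine counts and machine types fit together in one configuration, the following Python sketch builds a trainingInput section for the CUSTOM scale tier and serializes it as JSON, which is also valid YAML. The field names (scaleTier, masterType, workerType, workerCount, parameterServerType, parameterServerCount) come from the AI Platform Training job reference; the helper build_custom_config and the specific machine types and counts are assumptions chosen only for the example.

```python
import json


def build_custom_config(master_type, worker_type, worker_count,
                        ps_type, ps_count):
    """Build a job configuration dict for the CUSTOM scale tier."""
    training_input = {"scaleTier": "CUSTOM", "masterType": master_type}
    # Worker fields are only meaningful when at least one worker is requested.
    if worker_count > 0:
        training_input["workerType"] = worker_type
        training_input["workerCount"] = worker_count
    # The same applies to parameter servers.
    if ps_count > 0:
        training_input["parameterServerType"] = ps_type
        training_input["parameterServerCount"] = ps_count
    return {"trainingInput": training_input}


# Illustrative values: one master, four workers, two parameter servers.
config = build_custom_config(
    master_type="n1-standard-8",
    worker_type="n1-standard-8",
    worker_count=4,
    ps_type="n1-standard-4",
    ps_count=2,
)
config_text = json.dumps(config, indent=2)

if __name__ == "__main__":
    print(config_text)
```

The printed text could be saved to a file and passed to the job submission command through its --config flag, for example with gcloud ai-platform jobs submit training. Check the machine type names against the list currently supported in your region before relying on them.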

If you require more control over the software environment used in training, you can use custom containers with Cloud AI Platform. With custom containers, you build and deploy your own training image, which lets you choose the ML framework, its version, and any other dependencies required for training. Custom containers do not themselves select hardware: the number and type of machines are still set in the configuration file, and each role in the cluster is pointed at the image it should run.
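
A custom container is attached to a training job through the same configuration file. The sketch below, written in Python for consistency with the other examples, adds an image reference to each role of an existing trainingInput section. The masterConfig, workerConfig, and parameterServerConfig fields with their imageUri key are part of the AI Platform Training job reference; the helper with_custom_container and the image URI shown are hypothetical placeholders.

```python
def with_custom_container(training_input, image_uri):
    """Return a copy of training_input with a container image set per role."""
    updated = dict(training_input)  # leave the caller's dict untouched
    updated["masterConfig"] = {"imageUri": image_uri}
    # Only add role configs for roles that the job actually requests.
    if int(updated.get("workerCount", 0)) > 0:
        updated["workerConfig"] = {"imageUri": image_uri}
    if int(updated.get("parameterServerCount", 0)) > 0:
        updated["parameterServerConfig"] = {"imageUri": image_uri}
    return updated


if __name__ == "__main__":
    base = {
        "scaleTier": "CUSTOM",
        "masterType": "n1-standard-4",
        "workerType": "n1-standard-4",
        "workerCount": 2,
    }
    # Placeholder image URI; replace with an image you have built and pushed.
    print(with_custom_container(base, "gcr.io/my-project/my-trainer:latest"))
```

Here every role runs the same image, which is the simplest arrangement; the API also allows a different image per role.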

In summary, when using distributed ML model training on Google Cloud AI Platform, you can define the number of machines used for training through the job configuration file, either by picking a predefined scale tier or by setting explicit worker and parameter server counts under the CUSTOM tier. The CUSTOM tier also lets you specify the machine type for each role, whereas predefined tiers fix the machine types for you. If you additionally need control over the software environment, you can use custom containers to build and deploy your own training image.
