Machine Learning Training Options

So, you’ve decided you want to train your own machine learning model to satisfy your business needs… Great!
But what are your options to train the model?

Actually, you need to decide:

  • Whether you want to train your model on your own hardware.
  • Whether you want to rent someone else’s hardware via cloud computing services.
  • Whether to use a cloud machine learning service.
  • Which machine learning service provider best fits your needs.
  • Whether you want to use an automated machine learning service.

This post helps you make these decisions. So let’s find out!

Training On Your Own Device

The most intuitive option for training your machine learning model is to use one or more computers that you own.
Set up your favorite machine learning framework, preprocess your data and perform feature engineering if needed, choose a suitable model for your case, feed it the data, and let the training run.
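As a rough illustration of these steps, here is a minimal sketch that assumes scikit-learn as the framework. The bundled Iris dataset, the choice of logistic regression, and the helper name train_locally are placeholders picked for the example, not recommendations for your case.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def train_locally(random_state=0):
    """Train a small classifier on this machine; return it with its test accuracy."""
    # Stand-in data: replace with your own features and labels.
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state, stratify=y
    )
    # Chain preprocessing (scaling) and the model so both are fit on training data only.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    # Accuracy on data the model has not seen during training.
    return model, model.score(X_test, y_test)


model, accuracy = train_locally()
print(f"held-out accuracy: {accuracy:.2f}")
```

The same script would run unchanged on a rented machine, which is why it is useful to keep the whole workflow in one reproducible file.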

Pros:

  • Total control! You can train however you want, whatever you want, and for as long as you want.
  • You own the trained model, you determine its format, and you can easily deploy it wherever you want.

Cons:

  • You pay for hardware, software, electricity, and everything else that keeps your computer running. Although the hardware is extremely expensive, this may still be cheaper in the long run.
  • You are responsible for setting up the environment yourself.
  • Training the model is up to you. If you are not experienced with this, you have to hire someone to do it.
  • You’re also completely responsible for the deployment process.

When your model is small enough, training on your own hardware is a sensible choice. However, for big models with lots of training data, using a cloud service makes it easier to scale up quickly when you need more resources.

Training in The Cloud

We have two options here:

Hardware as a Service

You rent hardware in someone else’s data center. For deep learning, you can even rent instances with GPUs. Whatever you do on this hardware is completely up to you.

You train your model the same way you would on your local machine, except that you first need to upload your data to the cloud.

When you’re done, you can download your model and delete the compute instance. You only pay for the compute hours used to train the model. Now you have a trained model that you can use anywhere you like.
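To see how pay-as-you-go pricing compares with buying hardware, a back-of-the-envelope calculation can help. The sketch below is only an illustration: the function names are made up for this example, and every price in it is hypothetical rather than taken from any real provider.

```python
def rental_cost(hours, hourly_rate):
    """Pay-as-you-go cost of renting one instance for `hours` hours."""
    return hours * hourly_rate


def break_even_hours(hardware_price, hourly_rate, own_hourly_running_cost=0.0):
    """Hours of training after which buying hardware becomes cheaper than renting.

    Returns None when renting costs no more per hour than running your own
    machine, because in that case buying never pays off.
    """
    saving_per_hour = hourly_rate - own_hourly_running_cost
    if saving_per_hour <= 0:
        return None
    return hardware_price / saving_per_hour


# Hypothetical figures for illustration only, not real provider prices.
GPU_INSTANCE_RATE = 3.00     # cost per hour of a rented GPU instance
WORKSTATION_PRICE = 6000.00  # one-off purchase price of your own machine
WORKSTATION_RUNNING = 0.50   # electricity and upkeep per hour of use

print(rental_cost(40, GPU_INSTANCE_RATE))  # 120.0
print(break_even_hours(WORKSTATION_PRICE, GPU_INSTANCE_RATE, WORKSTATION_RUNNING))  # 2400.0
```

With these made-up numbers, a single 40-hour training run is far cheaper to rent, while owning the machine only pays off after thousands of training hours.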

Pros:

  • Flexibility, no responsibility. If you need more compute power, you can easily and quickly rent more.
  • You usually train only once, so you can rent these computers for a limited amount of time and pay as you go.
  • You’re not limited. Here too, you are in control: you can train however you want, whatever you want.
  • Here again, you own the trained model, you determine its format, and you can deploy it wherever you want.

Cons:

  • You are responsible for setting up the environment yourself.
  • Training the model is completely up to you. If you are not experienced with this, you have to hire someone to do it.
  • You need to upload your training data to the cloud.
  • You’re also completely responsible for the deployment process.

Machine Learning as a Service

The trend of making everything-as-a-service has affected the Machine Learning industry too. Several companies, such as Amazon, Microsoft, and Google, now offer machine learning as a service on top of their existing cloud services.

We went through a few of the best machine learning platforms on the market in previous posts.

These platforms provide machine learning in two ways:

Traditional Machine Learning

These services offer you a managed machine learning environment to train your models in. It’s the middle ground between using a fully managed automated ML service and renting bare hardware, where you manage everything yourself from scratch.

To train a model, you usually need to upload your data to one of the storage services of the same ML service provider. Then you can immediately process your data, pick a training algorithm, and start the training process.

Pros:

  • Flexibility, no responsibility. If you need more compute power, you can easily and quickly rent more.
  • You usually train only once, so you can rent these computers for a limited amount of time and pay as you go.
  • Some services let you run your training in parallel on multiple machines without any extra effort. Some also provide built-in optimized algorithms.
  • You’re not responsible for setting up the environment.
  • The provider can take over the deployment process.

Cons:

  • Training the model is completely up to you. If you are not experienced with this, you have to hire someone to do it.
  • You need to upload your training data to one of the storage services of the same ML service provider, so you have to pay for the storage too.
  • You’re somewhat limited. You can only use the ML frameworks supported by the service. Although some services let you use your own algorithms and preferred frameworks in your own container, setting up a container means you’re responsible for the environment again. Moreover, parallel training is often limited to certain algorithms, and the built-in algorithms are limited too.
  • Although most of the services let you export your trained model, it may be in a special format that is hard to handle outside the provider’s environment, like the iLearner format of Azure ML models. It’s worth noting that Google AI Platform is the most flexible in this respect.

If you plan to use these services for your next model training, here is a brief comparison of the three best-known ML services, Google AI Platform, AWS SageMaker, and Azure Machine Learning:

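Google AI Platform

  • Supported frameworks: TensorFlow, Scikit-Learn, XGBoost
  • Built-in algorithms: Yes (Beta)
  • GPU: Yes
  • Parallel training: Yes
  • Notebook: No
  • Drag & drop: No

AWS SageMaker

  • Supported frameworks: PyTorch, MXNet, Chainer, SparkML, Scikit-Learn
  • Built-in algorithms: Yes
  • GPU: Yes
  • Parallel training: No
  • Notebook: Yes
  • Drag & drop: No

Azure Machine Learning

  • Supported frameworks: Scikit-Learn, PyTorch, TensorFlow, Chainer
  • Built-in algorithms: No
  • GPU: Yes
  • Parallel training: No
  • Notebook: Yes
  • Drag & drop: Yes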

You can also refer to our previous posts about these services for detailed information.

Automated Machine Learning

Most machine learning service providers also offer an Automated Machine Learning (AutoML) service, which automates the time-consuming, iterative tasks of machine learning model development.
It’s primarily targeted at people without machine learning experience, to help them build models tailored to their business needs.
You can refer to our previous post about AutoML for more detailed information on this topic.
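To give a feel for the kind of iterative work that gets automated, here is a deliberately tiny sketch that tries several candidate models and keeps the best one by cross-validation. It assumes scikit-learn and uses a hypothetical helper named pick_best_model. It is not how any provider’s AutoML service is implemented; real services also search over preprocessing steps, hyperparameters, and architectures.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def pick_best_model(X, y, candidates, cv=5):
    """Cross-validate every candidate; return (best name, best score, all scores)."""
    scores = {
        name: cross_val_score(estimator, X, y, cv=cv).mean()
        for name, estimator in candidates.items()
    }
    best_name = max(scores, key=scores.get)
    return best_name, scores[best_name], scores


# Candidate models chosen arbitrarily for the illustration.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_nearest_neighbors": KNeighborsClassifier(n_neighbors=5),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

# Stand-in data: replace with your own features and labels.
X, y = load_iris(return_X_y=True)
best_name, best_score, all_scores = pick_best_model(X, y, candidates)
print(f"best model: {best_name} (mean CV accuracy {best_score:.2f})")
```

Doing this loop by hand for every dataset is exactly the repetitive part that an AutoML service takes off your plate.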

Pros:

  • You’re not responsible for anything: just upload your data to one of the provider’s storage services. The training and deployment will be handled automatically.

Cons:

  • In the world of AutoML, we pretend that data exploration and domain knowledge don’t matter. We can only do that for a few limited use cases. So automated machine learning has a narrow happy path; that is, it’s easy to step off the path and get into trouble, which may negatively impact your model’s performance.
  • You’re limited to specific algorithms and a narrow scope of problems.
  • Mostly, you don’t own your model. Most of these services do NOT let you download your trained models, so for deployment you have no choice but to use their platform as well.

If you are planning to use AutoML services, we encourage you to take a look at our previous post on Google’s AutoML service.

Finally, even if you want to train in the cloud, it’s good practice to start with a reduced dataset on your own computer to make sure the model works correctly. Once you trust that the model will give useful predictions, move on to training in the cloud.
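One way to build such a reduced dataset is to draw a small random sample that keeps the label proportions of the full data. The sketch below uses only the Python standard library. The helper name stratified_subsample and the toy spam/ham data are assumptions made up for this example.

```python
import random
from collections import defaultdict


def stratified_subsample(rows, label_of, fraction, seed=0):
    """Return roughly `fraction` of `rows`, keeping at least one row per label."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so the small dataset is reproducible
    by_label = defaultdict(list)
    for row in rows:
        by_label[label_of(row)].append(row)
    sample = []
    for group in by_label.values():
        # Sample each label separately so rare labels are not lost.
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    rng.shuffle(sample)
    return sample


# Toy dataset: (feature, label) pairs with an imbalanced label distribution.
full_dataset = [(i, "spam" if i % 10 == 0 else "ham") for i in range(1000)]
small_dataset = stratified_subsample(full_dataset, label_of=lambda r: r[1], fraction=0.05)
print(len(small_dataset))  # 50
```

Sampling per label matters most for imbalanced data, where a plain random sample of a few rows could miss a rare class entirely.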

Conclusion

In this post, we discussed the options available for training your own machine learning model. Here is our final decision diagram, which we hope will help you make your decision:

[Figure: Training options decision diagram]

