Google’s AutoML Overview

In this post, we explore how Google’s AutoML can help us at Almeta develop automatic Arabic language processing tools.

Before we start: if you are not familiar with the term AutoML, you can refer to our previous post on this topic.

Who Is Google AutoML For, and When Should You Use It?

Google’s Cloud AutoML targets people who have limited knowledge of machine learning.

The main goal of this cloud service is to let users build their own AI models, tailored to their business needs, when the services provided by Google’s AI APIs cannot satisfy those needs, even if the users do not have much machine learning knowledge.

In general, anyone can use these services to build a custom AI model on the fly.

What Kinds of AutoML Services Does Google Provide?

Let’s see what kinds of AutoML services are provided by Google in the NLP field, and whether they can be adapted to process the Arabic language.

Cloud AutoML Natural Language Classification

Enables you to create custom machine learning models to classify content into a custom set of categories.

According to Google’s documentation, the current service supports content classification in English language text. We can train a custom model to classify text in other languages including Arabic, but the model quality may vary.

How to train the model?

Build your own dataset and upload it as a .csv file. The trained model is automatically deployed, and we can get its predictions using an API.
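As a rough sketch of the dataset step, the snippet below writes a small labeled dataset in CSV form using only Python’s standard library. The two-column layout (text, then label), the sample sentences, and the category names are assumptions made for illustration; the exact column layout the service expects should be checked against Google’s documentation.

```python
import csv
import io

# Hypothetical labeled examples: (text, category). The sentences and the
# category names are made up for illustration.
EXAMPLES = [
    ("فاز الفريق بالمباراة النهائية", "sports"),
    ("ارتفعت أسعار النفط اليوم", "economy"),
]

def write_classification_csv(rows, handle):
    """Write (text, label) pairs as CSV rows and return the row count."""
    writer = csv.writer(handle)
    count = 0
    for text, label in rows:
        writer.writerow([text.strip(), label])
        count += 1
    return count

# Write into an in-memory buffer so the sketch has no side effects.
buffer = io.StringIO()
n_rows = write_classification_csv(EXAMPLES, buffer)
csv_text = buffer.getvalue()

# To produce a real file, open it with UTF-8 so the Arabic text stays intact:
# with open("classification.csv", "w", newline="", encoding="utf-8") as f:
#     write_classification_csv(EXAMPLES, f)
```

Writing through the csv module rather than joining strings by hand means commas and quotes inside the text are escaped correctly.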

Cloud AutoML Natural Language Entity Extraction

Enables you to create custom machine learning models to identify a custom set of entities.

According to Google’s documentation, this service currently supports entity analysis in English language text. We can train a custom model using text in other languages, but the model performance is undetermined.

How to train the model?

Annotate a dataset, then upload it in JSON format. Annotation can be done before uploading the data, after uploading it through Google’s AutoML UI, or by requesting it from Google’s labeling service. The trained model is automatically deployed, and we can get its predictions using an API.
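To give an idea of what annotating before upload could look like, here is a minimal sketch that builds one annotated example as a JSON line. The field names (annotations, text_extraction, text_segment, text_snippet) are modeled on our reading of Google’s format and should be treated as assumptions, as should the example sentence and labels. The end offset is exclusive here, as in Python slicing; confirm which convention the service expects.

```python
import json

def make_entity_record(text, entities):
    """Build one annotated example.

    `entities` is a list of (surface_string, label) pairs. Offsets are
    character positions of the first occurrence of each surface string.
    Field names are an assumption, not a confirmed schema.
    """
    annotations = []
    for surface, label in entities:
        start = text.find(surface)
        if start < 0:
            raise ValueError(f"{surface!r} not found in text")
        annotations.append({
            "display_name": label,
            "text_extraction": {
                "text_segment": {
                    "start_offset": start,
                    "end_offset": start + len(surface),  # exclusive end
                }
            },
        })
    return {"annotations": annotations, "text_snippet": {"content": text}}

# Hypothetical sentence and labels, for illustration only.
record = make_entity_record(
    "Sara visited Cairo",
    [("Sara", "PERSON"), ("Cairo", "LOCATION")],
)

# ensure_ascii=False keeps Arabic text readable in the output file.
json_line = json.dumps(record, ensure_ascii=False)
```

Computing offsets from the text itself, instead of typing them by hand, avoids off-by-one mistakes in the annotations.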

Cloud AutoML Natural Language Sentiment Analysis

Enables you to create custom machine learning models to analyze attitudes.

According to Google’s documentation, the current service supports sentiment analysis in English language text. We can train a custom model to classify text in other languages including Arabic, but the model quality may vary.

The sentiment score is an integer ranging from 0 (relatively negative) to a maximum value of your choice (positive). So, if we want to identify whether the sentiment is negative, positive, or neutral, we would label the training data with sentiment scores of 0 (negative), 1 (neutral), and 2 (positive).
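The labeling scheme above can be sketched as a small mapping from human-readable labels to integer scores. The label names and the sample sentences are hypothetical; only the 0 to maximum integer scheme comes from the description above.

```python
# Hypothetical label names; the scores follow the 0..max scheme described above.
SENTIMENT_SCORES = {"negative": 0, "neutral": 1, "positive": 2}
MAX_SCORE = max(SENTIMENT_SCORES.values())

def to_score(label):
    """Map a human-readable label to its integer sentiment score."""
    try:
        return SENTIMENT_SCORES[label.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown sentiment label: {label!r}") from None

def to_training_rows(labeled_texts):
    """Turn (text, label) pairs into (text, score) rows for a CSV writer."""
    return [(text, to_score(label)) for text, label in labeled_texts]

rows = to_training_rows([
    ("الخدمة ممتازة", "positive"),
    ("الخدمة سيئة جدا", "Negative"),
    ("وصل الطلب اليوم", "neutral"),
])
```

A finer-grained scale only requires extending the dictionary, for example five entries scored 0 to 4.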

How to train the model?

Build your own dataset, then upload it as a .csv file. The trained model is automatically deployed, and we can get its predictions using an API.

Cloud AutoML Natural Language Pricing

Training Cost: The cost of training a model is $3.00 per hour.

Prediction Cost: The usage of AutoML Natural Language is calculated monthly in terms of how many text records were sent for analysis during the billing month, as follows:

If a document contains more than 1,000 characters, it counts as one text record for every 1,000 characters.

Feature | 0 – 30K | 30K+ – 5M+
AutoML Natural Language Content Classification | Free | $5.00
AutoML Natural Language Sentiment Analysis | Free | $5.00
AutoML Natural Language Entity Extraction | Free | $5.00
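To make the record counting concrete, here is a small sketch that turns document lengths into text records, subtracts the free tier, and estimates training cost. Rounding partial records up and counting an empty document as one record are our assumptions; the constants are the figures quoted above.

```python
import math

CHARS_PER_RECORD = 1_000          # one text record per 1,000 characters
FREE_RECORDS_PER_MONTH = 30_000   # free tier from the pricing table
TRAINING_PRICE_PER_HOUR = 3.00    # USD per training hour

def text_records(num_chars):
    """Text records billed for one document.

    Assumption: partial records round up, and an empty document still
    counts as one record.
    """
    return max(1, math.ceil(num_chars / CHARS_PER_RECORD))

def billable_records(doc_lengths):
    """Records beyond the monthly free tier for a list of document lengths."""
    total = sum(text_records(n) for n in doc_lengths)
    return max(0, total - FREE_RECORDS_PER_MONTH)

def training_cost(hours):
    """Training cost in USD for the given number of training hours."""
    return hours * TRAINING_PRICE_PER_HOUR
```

For example, under these assumptions 20,000 news articles of 1,500 characters each count as 40,000 text records, of which 10,000 fall outside the free tier.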

Cloud AutoML Translation

Enables you to create custom translation models so that translation queries return results specific to a defined domain.

The supported language pairs are listed in Google’s documentation and include Arabic to English and English to Arabic.

How to train the model?

AutoML Translation trains custom models using matching pairs of sentences in the source and target languages.

The sentence pairs used to train the custom model must be in tab-separated values (.tsv) or Translation Memory eXchange (.tmx) format. Multiple .tsv and .tmx files can be batched into a comma-separated values (.csv) file. AutoML Translation uses the sentence pairs you provide to train, validate, and test the custom model.
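A minimal sketch of producing the .tsv side of this, assuming one sentence pair per line with the source sentence first and the target sentence second. The sample pairs are hypothetical, and the column order is an assumption to verify against Google’s documentation.

```python
import io

# Hypothetical English-Arabic sentence pairs, for illustration only.
PAIRS = [
    ("Good morning", "صباح الخير"),
    ("Thank you", "شكرا لك"),
    ("", "جملة بلا مصدر"),   # incomplete pair, skipped below
]

def clean(sentence):
    """Collapse tabs, newlines, and repeated spaces so a pair stays on one line."""
    return " ".join(sentence.split())

def write_tsv(pairs, handle):
    """Write source<TAB>target lines and return how many pairs were written."""
    written = 0
    for source, target in pairs:
        source, target = clean(source), clean(target)
        if not source or not target:
            continue  # skip pairs with a missing side
        handle.write(f"{source}\t{target}\n")
        written += 1
    return written

buffer = io.StringIO()
n_pairs = write_tsv(PAIRS, buffer)
tsv_text = buffer.getvalue()
```

Cleaning each sentence first matters because a stray tab or newline inside a sentence would otherwise shift the columns for that pair.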

The trained model is automatically deployed, and we can get its predictions using an API.

Cloud AutoML Translation Pricing

Training Cost: The cost of training a model is $76.00 per hour. If training fails for any reason other than a user-initiated cancelation, you will not be billed for the time.

Translation Cost: Your usage of AutoML Translation is calculated in terms of how many characters you send for translation with an AutoML custom model.

Feature | 0 – 0.5 million characters | 0.5 – 5 million characters
Translation | Free | $80 per 1,000,000 characters*

* Price is per character sent for processing, including whitespace characters. Empty queries are charged for one character.
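The character-based pricing can be sketched as follows. We assume the first 0.5 million characters are free and only the excess is billed; the table does not cover usage beyond 5 million characters, so neither does this sketch.

```python
FREE_CHARS = 500_000              # free tier from the table above
PRICE_PER_MILLION_CHARS = 80.00   # USD, for the 0.5 to 5 million range
TRAINING_PRICE_PER_HOUR = 76.00   # USD per training hour

def billed_characters(queries):
    """Characters billed for a list of queries.

    Whitespace counts, and an empty query is charged as one character.
    """
    return sum(max(1, len(query)) for query in queries)

def translation_cost(total_chars):
    """Cost in USD, assuming only characters above the free tier are billed."""
    billable = max(0, total_chars - FREE_CHARS)
    return billable * PRICE_PER_MILLION_CHARS / 1_000_000

def training_cost(hours):
    """Training cost in USD; failed (non-cancelled) runs are not billed."""
    return hours * TRAINING_PRICE_PER_HOUR
```

Under these assumptions, sending 1.5 million characters for translation would cost $80, since the first 0.5 million are free.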

Cloud AutoML Tables

Enables you to automatically build and deploy state-of-the-art machine learning models on structured data at massively increased speed and scale. Here are its features and capabilities:

Data support

Helps in creating clean, effective training data by providing information about missing data, correlation, cardinality, and distribution for each of your features.

Feature engineering

Automatically performs common feature engineering tasks, including:

  • Normalize and bucketize numeric features.
  • Create one-hot encoding and embeddings for categorical features.
  • Perform basic processing for text features.
  • Extract date- and time-related features from Timestamp columns.
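To illustrate what these transformations mean, here is a plain-Python sketch of each one. It is not how AutoML Tables implements them internally; the function names, the bucket boundaries, and the chosen timestamp features are all our own assumptions.

```python
from bisect import bisect_right
from datetime import datetime
from statistics import mean, pstdev

def normalize(values):
    """Z-score normalization: shift to zero mean and scale to unit variance."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

def bucketize(value, boundaries):
    """Index of the bucket that `value` falls into, given sorted boundaries."""
    return bisect_right(boundaries, value)

def one_hot(value, vocabulary):
    """Binary vector with a 1 at the position of `value` in `vocabulary`."""
    return [1 if value == item else 0 for item in vocabulary]

def timestamp_features(timestamp):
    """Split an ISO 8601 timestamp into date- and time-related features."""
    dt = datetime.fromisoformat(timestamp)
    return {
        "year": dt.year,
        "month": dt.month,
        "weekday": dt.weekday(),  # Monday is 0
        "hour": dt.hour,
    }

scaled = normalize([1.0, 2.0, 3.0])
age_bucket = bucketize(25, [18, 30, 50])      # buckets: <18, 18-29, 30-49, 50+
color = one_hot("b", ["a", "b", "c"])
parts = timestamp_features("2019-07-15T13:45:00")
```

Embeddings for categorical features and text processing are left out, since they need a trained model or a tokenizer rather than a few lines of arithmetic.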

Model training

AutoML Tables trains multiple model architectures at the same time. The architectures it tests include:

  • Linear
  • Feedforward deep neural network
  • Gradient Boosted Decision Tree
  • AdaNet
  • Ensembles of various model architectures

Model evaluation and final model creation

AutoML Tables uses a validation set to determine the best model architecture for the data. After that, two models are trained:

  1. A model trained with the training and validation sets. This model is used to predict the test set targets, which provides the evaluation of the model.
  2. A model trained with the training, validation, and test sets. This is the model that is provided to be used to make predictions.
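The procedure above can be sketched with two toy candidate models. This is only an illustration of the select, evaluate, and retrain flow; the candidate models, the mean squared error metric, and the synthetic data are our assumptions, not what AutoML Tables uses.

```python
def mse(predict, data):
    """Mean squared error of a prediction function over (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

def fit_mean(data):
    """Baseline candidate: always predict the average target."""
    avg = sum(y for _, y in data) / len(data)
    return lambda x: avg

def fit_linear(data):
    """Second candidate: one-variable least-squares line."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data)
    slope = sum((x - mx) * (y - my) for x, y in data) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

CANDIDATES = {"mean": fit_mean, "linear": fit_linear}

def select_and_train(train, val, test):
    # Pick the architecture with the lowest error on the validation set.
    scores = {name: mse(fit(train), val) for name, fit in CANDIDATES.items()}
    best = min(scores, key=scores.get)
    fit = CANDIDATES[best]
    # Model 1: trained on train + validation, evaluated on the test set.
    test_error = mse(fit(train + val), test)
    # Model 2: trained on all the data, the one used for predictions.
    final_model = fit(train + val + test)
    return best, test_error, final_model

# Synthetic data following y = 2x + 1, split into train / validation / test.
points = [(float(x), 2.0 * x + 1.0) for x in range(12)]
train, val, test = points[:8], points[8:10], points[10:]
best_name, test_error, final_model = select_and_train(train, val, test)
```

Note that the reported test error belongs to the first model; the final model has seen the test set, so it has no held-out evaluation of its own.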

Supported Problem Types

  • Regression problems
  • Classification problems

Cloud AutoML Tables Pricing

Prices for the usage of AutoML Tables are computed based on the underlying GCP resources required for model training, model deployment, batch prediction, and online prediction. You don’t incur charges from AutoML Tables until you start training your model.

Model training costs: Model training costs $19.32 per hour of compute resources used to train the model.

Model deployment costs: Model deployment costs $0.005 per GiB per hour for each machine on which the model is deployed. Google currently replicates the model in memory on 9 machines for low-latency serving, so a 9x multiplier is applied to this cost.

Batch prediction costs: Batch prediction using the model costs $1.16 per hour of computing resources used.

Online prediction costs: Online predictions using the model cost $0.21 per hour of compute resources used.
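Putting the four prices together, a rough cost estimate could look like the sketch below. The function and parameter names are ours, and the usage figures in the example call are hypothetical; the rates and the 9x multiplier are the ones quoted above.

```python
TRAINING_PER_HOUR = 19.32         # USD per training compute hour
DEPLOYMENT_PER_GIB_HOUR = 0.005   # USD per GiB per hour per machine
DEPLOYMENT_REPLICAS = 9           # machines the model is replicated to
BATCH_PER_HOUR = 1.16             # USD per batch prediction compute hour
ONLINE_PER_HOUR = 0.21            # USD per online prediction compute hour

def tables_cost(train_hours=0.0, model_gib=0.0, deployed_hours=0.0,
                batch_hours=0.0, online_hours=0.0):
    """Break an AutoML Tables bill estimate down by component (USD)."""
    costs = {
        "training": train_hours * TRAINING_PER_HOUR,
        "deployment": (model_gib * deployed_hours
                       * DEPLOYMENT_PER_GIB_HOUR * DEPLOYMENT_REPLICAS),
        "batch": batch_hours * BATCH_PER_HOUR,
        "online": online_hours * ONLINE_PER_HOUR,
    }
    costs["total"] = sum(costs.values())
    return costs

# Hypothetical usage: 2 training hours, a 1 GiB model deployed for a day,
# 1 hour of batch prediction, and 10 hours of online prediction.
estimate = tables_cost(train_hours=2, model_gib=1, deployed_hours=24,
                       batch_hours=1, online_hours=10)
```

With these hypothetical figures, training dominates the bill at about $38.64, while a day of deployment for a 1 GiB model adds about $1.08.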

Conclusion

In this post, we covered the services provided by Google’s AutoML, including Cloud AutoML Natural Language, Cloud AutoML Translation, and Cloud AutoML Tables, along with their suitability for processing Arabic text and their pricing.

Did you know that we use these and other AI technologies in our app? See what you have just read about in action by trying our Almeta News app. You can download it from Google Play or Apple’s App Store.
