BigQuery ML Model Types and Training

Introduction to BigQuery ML Model Types and Training

BigQuery ML lets data engineers train machine learning models with plain SQL, right inside the warehouse where the data already lives. Instead of exporting CSVs to a notebook, you move the compute to the data, which keeps governance, lineage, and cost in one place. This study note walks through the main model families the PDE exam expects you to know, from linear regression to ARIMA_PLUS forecasting and matrix factorization, and then ties them together with the training mechanics: TRANSFORM clauses, hyperparameter tuning, evaluation splits, and the Vertex AI Model Registry.

Plain English Explanation

Before diving into syntax, here are three concrete pictures that make BigQuery ML easier to remember during the exam.

BigQuery ML as a kitchen with a built-in chef

Imagine a restaurant where the walk-in fridge, the prep station, and the chef all share one room. You point at a pile of ingredients and say "make me a soup," and the chef cooks it on the spot without anyone wheeling carts back and forth. That is BigQuery ML. The data sits in BigQuery storage, and CREATE MODEL is the chef who pulls those ingredients off the shelf and produces a trained artifact in the same room. Compare that with the older workflow: pulling rows into Pandas, training in a notebook on your laptop, and uploading a pickle file somewhere. That is wheeling ingredients to a chef across town. BigQuery ML collapses that trip into a single room.

The model types are different cuisines on the same menu. Linear regression is a simple grilled fish: predictable, easy to explain, runs on almost any input. K-means is a buffet sorter that groups similar dishes together without a recipe. ARIMA_PLUS is the dessert chef who watches what people ate yesterday and predicts what they will order tomorrow. They share the kitchen, but each has its own technique.

Model selection as picking the right power tool

A handyman owns a toolbox with a screwdriver, a power drill, a circular saw, and a laser level. The screwdriver is fine for a loose hinge but wrong for cutting plywood. BigQuery ML model types work the same way. Logistic regression is the screwdriver for a yes-or-no question like "will this customer churn?" Boosted tree (XGBoost) is the power drill that handles messy mixed inputs and is often the strongest performer on tabular data. DNN is the circular saw, very capable but overkill if a screwdriver would do. PCA and matrix factorization are the laser level: they decompose the data into a few latent dimensions rather than predicting a label directly. The mistake on the exam is grabbing the saw when the screwdriver was the right answer.

Training, evaluation, and tuning as a driving test

Learning to drive has three stages. You practice in an empty parking lot (training data), you take a mock test on a quiet road (evaluation split), and finally you take the official road test (held-out test set or production traffic). BigQuery ML training mirrors this. The OPTIONS(data_split_method=...) clause decides how the parking lot, mock test, and road test are carved out. Hyperparameter tuning is like having an instructor who tries five different driving styles and picks the smoothest one. Skip evaluation and you are issuing licenses based on parking-lot performance alone, which is exactly how production models embarrass their owners.

Core Concepts of BigQuery ML Model Types and Training

BigQuery ML is organized around a small set of SQL primitives. CREATE MODEL creates and trains a model in one step. ML.PREDICT, ML.FORECAST, ML.RECOMMEND, and ML.EVALUATE consume the trained model. ML.TRIAL_INFO and ML.WEIGHTS introspect what was learned. Every model lives inside a BigQuery dataset, just like a table or view, and inherits the dataset's IAM and location.

The model families fall into four groups. Supervised regression and classification covers LINEAR_REG, LOGISTIC_REG, BOOSTED_TREE_REGRESSOR, BOOSTED_TREE_CLASSIFIER, DNN_REGRESSOR, DNN_CLASSIFIER, WIDE_AND_DEEP_REGRESSOR, and WIDE_AND_DEEP_CLASSIFIER. Unsupervised covers KMEANS and PCA. Recommendation covers MATRIX_FACTORIZATION. Time series covers ARIMA_PLUS and ARIMA_PLUS_XREG. In addition, AutoML Tables is exposed as the supervised types AUTOML_REGRESSOR and AUTOML_CLASSIFIER, which run Vertex AI AutoML behind the scenes but stay addressable from SQL.

An iteration is a single pass over the training data during gradient descent. BigQuery ML reports per-iteration loss in the model's training info, which you query with ML.TRAINING_INFO(MODEL <name>). Source: https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-train
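As a minimal sketch, the per-iteration loss can be read back like this. It reuses the `mart.churn_logit` model defined later in this note; any iteratively trained model name would work the same way.

```sql
-- One row per training iteration; compare loss with eval_loss to spot overfitting.
SELECT
  training_run,
  iteration,
  loss,           -- loss on the training split
  eval_loss,      -- loss on the evaluation split (NULL when there is no split)
  learning_rate,
  duration_ms
FROM ML.TRAINING_INFO(MODEL `mart.churn_logit`)
ORDER BY iteration;
```

If eval_loss starts rising while loss keeps falling, the model is probably overfitting and stronger regularization or early stopping is worth trying.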

Training data and evaluation data are controlled by the data_split_method option. Valid values include AUTO_SPLIT (the default for most supervised models), RANDOM, CUSTOM (you supply a boolean column), SEQ (chronological for time-aware splits), and NO_SPLIT. The exam likes to ask which method respects time order; the answer is SEQ paired with data_split_col.

Architecture and Design Patterns

A typical BigQuery ML pipeline has four layers. The raw layer holds source tables, often loaded from Cloud Storage or streamed in via the Storage Write API. The feature layer is a curated dataset of views or scheduled queries that produce model-ready columns. The model layer holds the trained models themselves, versioned by name suffix or by Model Registry version. The serving layer is either an online prediction endpoint exported to Vertex AI or a scheduled query that writes batch predictions back to BigQuery for downstream BI.

The pattern that the PDE exam rewards is keeping feature engineering inside a TRANSFORM clause rather than baking it into upstream views. When transformations live in the model, they travel with the model. ML.PREDICT automatically applies the same bucketing, scaling, or one-hot encoding to the input rows. If the same logic lives in a view, every consumer must remember to use the right view, and a forgotten step at prediction time creates training-serving skew.

The TRANSFORM clause is part of the model artifact. Predictions made with ML.PREDICT re-apply the exact same preprocessing automatically. Skipping TRANSFORM and pre-computing features in a view is the most common source of training-serving skew on the PDE exam. Source: https://cloud.google.com/bigquery/docs/auto-preprocessing

For high-cardinality categorical features, the design pattern is ML.FEATURE_CROSS combined with ML.QUANTILE_BUCKETIZE for numeric inputs, then a hashing step using ML.HASH_BUCKETIZE. For text, ML.NGRAMS and ML.TF_IDF produce sparse vectors that linear models digest cleanly. The DNN and Wide and Deep models are happy with raw numeric and string columns because BigQuery ML applies automatic preprocessing such as standardization for numerics and vocabulary lookup for strings.

GCP Service Deep Dive: BigQuery ML

This section walks each model type with the syntax fragment you should be able to recognize on the exam.

Linear and logistic regression

CREATE OR REPLACE MODEL `mart.churn_logit`
OPTIONS(
  model_type = 'LOGISTIC_REG',
  input_label_cols = ['churned'],
  data_split_method = 'AUTO_SPLIT',
  l1_reg = 0.01,
  enable_global_explain = TRUE
) AS
SELECT * EXCEPT(customer_id) FROM `mart.customer_features`;

LINEAR_REG predicts a numeric value, LOGISTIC_REG predicts a class probability. Both accept L1 and L2 regularization and early_stop; automatic class weights (auto_class_weights) apply only to LOGISTIC_REG, because only classification has classes to rebalance. They are cheap, fast, and explainable, which is why they appear in answer choices when the question stresses interpretability or low cost.
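A short sketch of how the logistic model above might be consumed. Scoring the training table `mart.customer_features` here is only for illustration; in practice the input would be a table of current customers with the same feature columns.

```sql
-- Score customers. Output column names are derived from the label name (churned).
SELECT
  customer_id,               -- not a feature; extra input columns pass through
  predicted_churned,
  predicted_churned_probs    -- array of (label, prob) structs
FROM ML.PREDICT(
  MODEL `mart.churn_logit`,
  (SELECT * FROM `mart.customer_features`),
  STRUCT(0.5 AS threshold)   -- classification cut-off, illustrative value
);

-- Global feature attributions; available because enable_global_explain = TRUE.
SELECT feature, attribution
FROM ML.GLOBAL_EXPLAIN(MODEL `mart.churn_logit`)
ORDER BY attribution DESC;
```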

K-means clustering

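CREATE OR REPLACE MODEL `mart.customer_segments`
OPTIONS(
  model_type = 'KMEANS',
  -- num_trials is required whenever an option uses HPARAM_RANGE or HPARAM_CANDIDATES.
  num_trials = 10,
  num_clusters = HPARAM_RANGE(3, 10),
  standardize_features = TRUE
) AS
SELECT recency, frequency, monetary FROM `mart.rfm`;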

K-means is unsupervised, so it has no input_label_cols. The exam often pairs K-means with hyperparameter tuning because choosing num_clusters is a textbook use of HPARAM_RANGE. The Davies-Bouldin index and mean squared distance are the standard evaluation metrics returned by ML.EVALUATE.
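Once the clustering model exists, segment assignment is an ordinary ML.PREDICT call. The sketch below counts customers per segment, using the same `mart.rfm` table the model was trained on.

```sql
-- For K-means, ML.PREDICT returns the nearest centroid for each input row.
SELECT
  centroid_id,
  COUNT(*) AS customers
FROM ML.PREDICT(
  MODEL `mart.customer_segments`,
  (SELECT recency, frequency, monetary FROM `mart.rfm`)
)
GROUP BY centroid_id
ORDER BY centroid_id;
```

Very uneven segment sizes are a hint that num_clusters or the input features deserve another look.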

Deep neural networks

DNN_REGRESSOR and DNN_CLASSIFIER expose hidden_units, activation_fn, dropout, batch_size, learn_rate, and optimizer. They are the right pick when there are nonlinear interactions that linear models miss. They are the wrong pick when training data is small, when latency budgets are tight, or when the team needs to explain individual predictions to auditors.
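A hedged sketch of a DNN classifier using the options named above. The model name `mart.churn_dnn` and every option value are illustrative assumptions, not tuned recommendations.

```sql
CREATE OR REPLACE MODEL `mart.churn_dnn`
OPTIONS(
  model_type = 'DNN_CLASSIFIER',
  input_label_cols = ['churned'],
  hidden_units = [64, 32],     -- two hidden layers
  activation_fn = 'RELU',
  dropout = 0.2,               -- regularization against overfitting
  batch_size = 256,
  learn_rate = 0.001,
  optimizer = 'ADAM',
  early_stop = TRUE
) AS
SELECT * EXCEPT(customer_id) FROM `mart.customer_features`;
```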

Wide and Deep

WIDE_AND_DEEP_CLASSIFIER combines a memorization arm (wide linear) with a generalization arm (deep). Use it when the dataset has both sparse categorical signals (user IDs, product IDs) and dense numeric signals (price, hour of day). It is a niche answer on the exam, usually distinguishing recommendation-ish problems that are not pure collaborative filtering.

AutoML Tables (AUTOML_REGRESSOR / AUTOML_CLASSIFIER)

CREATE OR REPLACE MODEL `mart.lead_score`
OPTIONS(
  model_type = 'AUTOML_CLASSIFIER',
  budget_hours = 1.0,
  input_label_cols = ['converted']
) AS
SELECT * FROM `mart.lead_features`;

AutoML Tables runs a managed neural architecture and feature search on Vertex AI but lets you stay in BigQuery for orchestration. The trade-off is cost and time. The exam clue is "we want the best possible model and we don't care which algorithm" combined with "the team does not have ML expertise."

Time series with ARIMA_PLUS and ARIMA_PLUS_XREG

CREATE OR REPLACE MODEL `mart.daily_orders_forecast`
OPTIONS(
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'order_date',
  time_series_data_col = 'orders',
  time_series_id_col = 'store_id',
  holiday_region = 'US',
  auto_arima = TRUE
) AS
SELECT order_date, orders, store_id FROM `mart.orders_daily`;

ARIMA_PLUS handles trend, seasonality, holidays, and anomaly cleansing automatically. time_series_id_col lets a single model train one ARIMA per store, which is the scalable answer for thousands of SKUs or locations. ARIMA_PLUS_XREG adds external regressors such as marketing spend or temperature. Forecasts come from ML.FORECAST, not ML.PREDICT.
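Reading forecasts from the model above might look like the following sketch. The 30-day horizon and 0.9 confidence level are illustrative choices.

```sql
-- One row per store_id per future timestamp.
SELECT
  store_id,
  forecast_timestamp,
  forecast_value,
  prediction_interval_lower_bound,
  prediction_interval_upper_bound
FROM ML.FORECAST(
  MODEL `mart.daily_orders_forecast`,
  STRUCT(30 AS horizon, 0.9 AS confidence_level)
)
ORDER BY store_id, forecast_timestamp;
```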

Boosted tree (XGBoost)

BOOSTED_TREE_REGRESSOR and BOOSTED_TREE_CLASSIFIER wrap XGBoost. They handle missing values, categorical features, and mixed scales without much preprocessing. They tend to win on tabular benchmarks. Key options are num_parallel_tree, max_tree_depth, subsample, colsample_bytree, and learn_rate. Combined with hyperparameter tuning, this is the default answer on the exam when the question says "highest accuracy on tabular data" without an interpretability constraint.
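For reference, a baseline boosted tree classifier with fixed (untuned) options could be sketched as follows. The model name `mart.churn_xgb` and the option values are assumptions chosen for illustration.

```sql
CREATE OR REPLACE MODEL `mart.churn_xgb`
OPTIONS(
  model_type = 'BOOSTED_TREE_CLASSIFIER',
  input_label_cols = ['churned'],
  max_iterations = 50,       -- number of boosting rounds
  max_tree_depth = 6,
  subsample = 0.8,           -- row sampling per tree
  colsample_bytree = 0.8,    -- column sampling per tree
  learn_rate = 0.1,
  early_stop = TRUE
) AS
SELECT * EXCEPT(customer_id) FROM `mart.customer_features`;

-- Which features the trees relied on most.
SELECT feature, importance_gain
FROM ML.FEATURE_IMPORTANCE(MODEL `mart.churn_xgb`)
ORDER BY importance_gain DESC;
```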

PCA

CREATE OR REPLACE MODEL `mart.feature_pca`
OPTIONS(
  model_type = 'PCA',
  num_principal_components = 10,
  scale_features = TRUE
) AS
SELECT * EXCEPT(user_id) FROM `mart.user_features`;

PCA is a dimensionality reduction model. Output is consumed via ML.PRINCIPAL_COMPONENTS and ML.PRINCIPAL_COMPONENT_INFO, and projected scores come from ML.PREDICT. Use it to compress hundreds of correlated columns into a handful of components before feeding them into K-means or a downstream regression.
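A brief sketch of inspecting and applying the PCA model above. Both queries use only the model and table already shown.

```sql
-- How much variance each component explains.
SELECT
  principal_component_id,
  eigenvalue,
  explained_variance_ratio,
  cumulative_explained_variance_ratio
FROM ML.PRINCIPAL_COMPONENT_INFO(MODEL `mart.feature_pca`)
ORDER BY principal_component_id;

-- Project rows onto the components; output includes one column per component.
SELECT *
FROM ML.PREDICT(MODEL `mart.feature_pca`, TABLE `mart.user_features`);
```

The cumulative ratio is a practical guide for choosing num_principal_components: keep adding components until the curve flattens.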

Matrix Factorization

CREATE OR REPLACE MODEL `mart.movie_recs`
OPTIONS(
  model_type = 'MATRIX_FACTORIZATION',
  user_col = 'user_id',
  item_col = 'movie_id',
  rating_col = 'rating',
  feedback_type = 'EXPLICIT',
  num_factors = 32
) AS
SELECT user_id, movie_id, rating FROM `mart.ratings`;

Matrix factorization is the recommendation workhorse. feedback_type='EXPLICIT' is for star ratings; 'IMPLICIT' is for clicks, watches, or purchases. Recommendations come from ML.RECOMMEND. This model type requires reservation slots, which is a frequent exam gotcha.

MATRIX_FACTORIZATION requires a reservation (capacity-based pricing) because it cannot be trained on on-demand slots. Candidates miss this when the question hints at "no reservation purchased." The correct answer often becomes "buy a reservation" or "use a different model type." Source: https://cloud.google.com/bigquery/docs/bigqueryml-mf-implicit-tutorial
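A minimal sketch of pulling top-5 recommendations per user from the model above. The table `mart.active_users` is hypothetical; it stands for any query that returns the user_id values you want to score.

```sql
SELECT user_id, movie_id, predicted_rating
FROM ML.RECOMMEND(
  MODEL `mart.movie_recs`,
  (SELECT DISTINCT user_id FROM `mart.active_users`)  -- hypothetical table
)
WHERE TRUE  -- harmless filter; keeps QUALIFY valid in every BigQuery version
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY user_id ORDER BY predicted_rating DESC
) <= 5;
```

Calling ML.RECOMMEND without an input query scores every user and item pair, which can be very large, so restricting the users is usually the cheaper choice.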

ML.FEATURE Functions and Manual Preprocessing

BigQuery ML's manual preprocessing functions let you write preprocessing in SQL that the TRANSFORM clause then bakes into the model. The scalers, the quantile bucketizer, and the one-hot encoder are analytic functions, so they need an OVER() clause. The most common ones are:

ML.STANDARD_SCALER(x) OVER(): zero mean, unit variance.
ML.MIN_MAX_SCALER(x) OVER(): rescale to the range 0 to 1.
ML.QUANTILE_BUCKETIZE(x, n) OVER(): equal-frequency buckets.
ML.BUCKETIZE(x, [boundaries]): custom-boundary buckets.
ML.FEATURE_CROSS(STRUCT(a, b)): interaction terms between categorical features.
ML.HASH_BUCKETIZE(x, n): hashing trick for high-cardinality strings.
ML.ONE_HOT_ENCODER(x) OVER(): explicit one-hot when you do not trust the default vocabulary.
ML.NGRAMS and ML.TF_IDF: text features.

A TRANSFORM clause looks like this:

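CREATE OR REPLACE MODEL `mart.price_model`
TRANSFORM(
  -- Scalers and quantile bucketizers are analytic functions, so they need OVER().
  ML.STANDARD_SCALER(sqft) OVER() AS sqft_z,
  ML.QUANTILE_BUCKETIZE(year_built, 10) OVER() AS yr_bucket,
  -- A TRANSFORM item cannot reference a sibling alias such as yr_bucket, so the
  -- cross buckets year_built inline (illustrative boundaries; zip assumed STRING).
  ML.FEATURE_CROSS(STRUCT(
    zip,
    ML.BUCKETIZE(year_built, [1950, 1980, 2000, 2015]) AS yr_band
  )) AS zip_x_yr,
  price  -- the label must be passed through unchanged
)
OPTIONS(model_type='LINEAR_REG', input_label_cols=['price']) AS
SELECT sqft, year_built, zip, price FROM `mart.listings`;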

Anything not listed in TRANSFORM is dropped. Anything listed becomes part of the model's serving signature. At prediction time you pass the raw sqft, year_built, and zip; the model handles the rest.
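To illustrate the serving side, a prediction query against the model above only needs the raw columns. The table `mart.new_listings` and its `listing_id` column are hypothetical names used for this sketch.

```sql
-- No scaling, bucketing, or crossing here: the model's TRANSFORM re-applies it.
SELECT
  listing_id,
  predicted_price
FROM ML.PREDICT(
  MODEL `mart.price_model`,
  (SELECT listing_id, sqft, year_built, zip FROM `mart.new_listings`)
);
```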

Always select the label inside the TRANSFORM clause even if you do not transform it. If you forget, BigQuery ML cannot find the label column at training time and the model creation fails with a confusing error. Source: https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create

Hyperparameter Tuning

BigQuery ML supports built-in hyperparameter tuning on most supervised model types and on K-means. You replace a fixed value with HPARAM_RANGE(min, max) for continuous parameters or HPARAM_CANDIDATES([a, b, c]) for discrete ones, then add tuning options.

CREATE OR REPLACE MODEL `mart.churn_xgb_tuned`
OPTIONS(
  model_type = 'BOOSTED_TREE_CLASSIFIER',
  input_label_cols = ['churned'],
  num_trials = 20,
  max_parallel_trials = 4,
  hparam_tuning_objectives = ['ROC_AUC'],
  learn_rate = HPARAM_RANGE(0.01, 0.3),
  max_tree_depth = HPARAM_CANDIDATES([4, 6, 8, 10]),
  l2_reg = HPARAM_RANGE(0.0, 1.0)
) AS
SELECT * EXCEPT(customer_id) FROM `mart.customer_features`;

Trial results are inspectable with ML.TRIAL_INFO(MODEL <name>). The default search algorithm is Vertex AI Vizier, a Bayesian optimizer that learns from earlier trials to pick smarter hyperparameters next. max_parallel_trials controls how many trials run concurrently; higher parallelism finishes faster but slightly hurts the Bayesian signal because trials cannot learn from siblings still in flight.
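A sketch of reviewing the tuning run for the model above:

```sql
-- One row per trial; is_optimal marks the trial that later functions use by default.
SELECT
  trial_id,
  hyperparameters,                    -- struct of the values tried
  hparam_tuning_evaluation_metrics,   -- objective metric(s) for the trial
  status,
  is_optimal
FROM ML.TRIAL_INFO(MODEL `mart.churn_xgb_tuned`)
ORDER BY is_optimal DESC, trial_id;
```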

The exam likes the distinction between num_trials (total trials) and max_parallel_trials (concurrent trials). It also likes that hparam_tuning_objectives defaults to a sensible metric per model type (R2_SCORE for regression, ROC_AUC for classification, DAVIES_BOULDIN_INDEX for K-means).

Training and Evaluation Splits

data_split_method controls how rows become training, evaluation, and (optionally) test sets. The five values you must memorize:

AUTO_SPLIT: default. The behavior depends on row count: under 500 rows, everything is used for training; from 500 to 50,000 rows, a random 20% is held out for evaluation; above 50,000 rows, a random 10,000 rows are held out.
RANDOM: pure random split using data_split_eval_fraction.
CUSTOM: you supply a boolean column via data_split_col; TRUE rows go to evaluation.
SEQ: sorts by data_split_col and uses the tail as evaluation. Use it when chronological order matters.
NO_SPLIT: entire dataset trains; nothing for evaluation. Common for unsupervised models or when evaluation is done downstream.
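As a sketch of a chronological split, the statement below holds out the most recent 20% of rows for evaluation. The column `snapshot_date` is a hypothetical timestamp column assumed to exist in `mart.customer_features`, and the model name is illustrative.

```sql
CREATE OR REPLACE MODEL `mart.churn_logit_seq`
OPTIONS(
  model_type = 'LOGISTIC_REG',
  input_label_cols = ['churned'],
  data_split_method = 'SEQ',
  data_split_col = 'snapshot_date',   -- hypothetical; used for ordering, not as a feature
  data_split_eval_fraction = 0.2      -- newest 20% of rows become the evaluation set
) AS
SELECT * EXCEPT(customer_id) FROM `mart.customer_features`;
```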

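Evaluation metrics depend on model type. ML.EVALUATE returns mean absolute error, mean squared error, and R-squared for regression; precision, recall, accuracy, F1, log loss, and ROC AUC for classification; Davies-Bouldin index and mean squared distance for K-means; ranking metrics such as mean average precision for implicit-feedback matrix factorization (explicit-feedback models report regression metrics); total explained variance ratio for PCA; and forecast-accuracy metrics such as MAE and SMAPE for ARIMA_PLUS when you pass held-out data. ML.ARIMA_EVALUATE complements this with fitting statistics such as AIC and the detected seasonal periods.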

Memorize the required OPTIONS keys per model family because the exam hides one wrong key in the answer choices. ARIMA_PLUS needs time_series_timestamp_col, time_series_data_col, and (for multi-series) time_series_id_col; MATRIX_FACTORIZATION needs user_col, item_col, rating_col, and feedback_type (EXPLICIT or IMPLICIT); KMEANS uses num_clusters; PCA uses num_principal_components; supervised models use input_label_cols. Source: https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create

For time-dependent training data in supervised models, the split that respects chronology is SEQ with data_split_col set to your timestamp. Random or auto splits let rows from the future land in the training set, which leaks information and inflates evaluation metrics; this is a classic exam trap. ARIMA_PLUS itself does not take data_split_method, so hold out the most recent period yourself and pass it to ML.EVALUATE. Source: https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create-time-series
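For forecasting models, one way to check accuracy against data the model has not seen is to pass a held-out window to ML.EVALUATE. In this sketch `mart.orders_daily_holdout` is a hypothetical table holding the most recent 30 days, with the same column names used at training time.

```sql
-- Compares forecasts with actuals and aggregates the error per time series.
SELECT *
FROM ML.EVALUATE(
  MODEL `mart.daily_orders_forecast`,
  (SELECT order_date, orders, store_id FROM `mart.orders_daily_holdout`),
  STRUCT(TRUE AS perform_aggregation, 30 AS horizon)
);
```

This only measures out-of-sample error if the model was trained on data that ends before the held-out window starts.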

Vertex AI Model Registry Integration

BigQuery ML models can be registered to the Vertex AI Model Registry with OPTIONS(model_registry='VERTEX_AI', vertex_ai_model_id='my_model', vertex_ai_model_version_aliases=['default']). Once registered, the model gets a Vertex AI resource name, can be deployed to an online prediction endpoint, can be evaluated alongside non-BQML models, and inherits the registry's versioning and staging aliases.
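Put together, registration at training time could be sketched as below. The model name and the Vertex AI model ID are illustrative.

```sql
CREATE OR REPLACE MODEL `mart.churn_logit_registered`
OPTIONS(
  model_type = 'LOGISTIC_REG',
  input_label_cols = ['churned'],
  model_registry = 'VERTEX_AI',                   -- register on successful training
  vertex_ai_model_id = 'churn_logit',             -- illustrative ID
  vertex_ai_model_version_aliases = ['default']
) AS
SELECT * EXCEPT(customer_id) FROM `mart.customer_features`;
```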

The workflow on the exam usually goes: train in BigQuery for cost reasons, register to Vertex AI for governance, deploy to an endpoint for low-latency online prediction, and keep batch predictions inside BigQuery via ML.PREDICT or scheduled queries. The Model Registry also gives you Model Evaluation slices and feature attribution viewing, which the BigQuery console only shows partially.

You can export many BigQuery ML models to Cloud Storage as TensorFlow SavedModel or XGBoost Booster files using EXPORT MODEL. That export path is how you ship a BigQuery-trained model to a Cloud Run service or a Vertex AI custom container without going through the Registry.
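The export itself is a single statement. The bucket path below is a placeholder, not a real location.

```sql
-- Writes the model artifact files under the given Cloud Storage prefix.
EXPORT MODEL `mart.churn_logit`
OPTIONS(URI = 'gs://example-bucket/models/churn_logit/');
```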

Common Pitfalls and Trade-offs

Slot pressure is the silent killer. A BOOSTED_TREE with num_trials=50 on a billion-row training set can saturate a 2000-slot reservation for hours. Without a reservation, on-demand pricing applies and the bill arrives as a surprise. The mitigation is to sample first with TABLESAMPLE SYSTEM (10 PERCENT) while iterating and only run the full training when the design is settled.
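A sketch of the sampling approach while iterating. It assumes `mart.customer_features` is a table rather than a view, since TABLESAMPLE applies to tables; the model name and max_iterations value are illustrative.

```sql
CREATE OR REPLACE MODEL `mart.churn_xgb_dev`
OPTIONS(
  model_type = 'BOOSTED_TREE_CLASSIFIER',
  input_label_cols = ['churned'],
  max_iterations = 20        -- keep development runs short
) AS
SELECT * EXCEPT(customer_id)
FROM `mart.customer_features` TABLESAMPLE SYSTEM (10 PERCENT);
```

TABLESAMPLE SYSTEM samples storage blocks rather than individual rows, so the fraction is approximate and the sample changes between runs.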

Training-serving skew shows up when feature engineering happens in two places. Either put it all in TRANSFORM or all in a versioned view that both training and prediction queries reference. Splitting the logic between an upstream pipeline and an ad-hoc cleanup in the training query is how teams ship models that look great offline and fail in production.

Cardinality blowups hurt linear and DNN models when raw user IDs or product IDs land in the feature set. Use ML.HASH_BUCKETIZE to cap dimensionality, or move to matrix factorization where high-cardinality IDs are the point.

Holiday and timezone handling in ARIMA_PLUS surprises new users. The holiday_region option only knows the built-in holiday calendar for the region, not company-specific events such as a private sale or a store anniversary. For those, use ARIMA_PLUS_XREG and pass the event calendar as an exogenous variable.

Reservations matter for several workloads. Matrix factorization cannot be trained on on-demand slots. Hyperparameter tuning trials count as separate jobs and consume slots in parallel; without a reservation, you may hit per-project on-demand quotas.

Best Practices

Start with the simplest model that could work. Linear or logistic regression sets a baseline that boosted trees and DNNs must beat to justify their cost and complexity.
Put preprocessing inside TRANSFORM so the same logic ships with the model.
Use SEQ splits for any data with a time component, even when the model itself is not a time series.
Reserve slots before kicking off long hyperparameter tuning runs; estimate cost with --dry_run queries first.
Register production models to the Vertex AI Model Registry to inherit versioning, staging, and feature attribution tooling.
Monitor training-serving skew by logging input distributions in BigQuery and comparing them to the model's training input statistics returned by ML.FEATURE_INFO.
Schedule retraining with a scheduled query or Cloud Composer DAG so models do not drift silently.
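For the skew-monitoring practice above, the training-time statistics can be read as in the sketch below. Comparing them with live data is left to a separate query over your own prediction inputs.

```sql
-- Per-feature statistics captured from the training data.
SELECT
  input,        -- feature name
  mean,
  stddev,
  null_count
FROM ML.FEATURE_INFO(MODEL `mart.churn_logit`)
ORDER BY input;
```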

Real-World Use Case

A mid-sized e-commerce retailer with 12 million customers and 800 stores wants three things: a churn score per customer updated nightly, a 30-day demand forecast per store-SKU pair updated weekly, and a recommendations carousel on the product page updated hourly.

The data engineering team builds three BigQuery ML pipelines. For churn, a BOOSTED_TREE_CLASSIFIER with hyperparameter tuning runs nightly inside a 1000-slot autoscaling reservation. The TRANSFORM clause bucketizes recency and lifetime value; predictions land in a customer_churn_scores table and feed Salesforce via Reverse ETL. For demand forecasting, a single ARIMA_PLUS model with time_series_id_col='store_sku' produces forecasts for roughly 400,000 series in a few hours of slot time, far cheaper than training one model per series. For recommendations, a MATRIX_FACTORIZATION model with feedback_type='IMPLICIT' trains on click and add-to-cart events, registers to Vertex AI Model Registry, and serves online via a Vertex AI endpoint that the website calls with userId.

Total cost runs at a small fraction of the previous setup, which used a Spark cluster for training and a hand-rolled prediction service. The win is not just dollars; it is that the analytics team can read the SQL and reason about what the models actually do.

Exam Tips

The PDE exam tests BigQuery ML in three ways. First, scenario-to-model mapping. Read the question for the keywords: "explainable" points to linear or logistic; "tabular highest accuracy" points to boosted tree; "no labels, group similar customers" points to K-means; "predict next 30 days" points to ARIMA_PLUS; "users and items" points to matrix factorization; "team has no ML expertise, find best model automatically" points to AutoML Tables.

Second, syntax recognition. Know that time_series_timestamp_col, time_series_data_col, and time_series_id_col are ARIMA_PLUS options. Know that user_col, item_col, and rating_col are matrix factorization options. Know that num_clusters belongs to K-means and num_principal_components belongs to PCA. The exam likes to hide a wrong option name in the answer choices.

Third, training mechanics. Be ready for questions on data_split_method='SEQ' for time-aware splits, TRANSFORM for avoiding training-serving skew, and HPARAM_RANGE for hyperparameter tuning. Remember that matrix factorization needs reservation slots, that hyperparameter tuning runs trials in parallel up to max_parallel_trials, and that Vertex AI Model Registry integration is enabled with a single OPTIONS clause.

When the question describes a forecasting problem with multiple series (one per store, one per SKU, one per region), the correct answer is a single ARIMA_PLUS model with time_series_id_col, not a loop that creates one model per series. The single-model approach is dramatically cheaper and is the documented best practice. Source: https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create-time-series

Frequently Asked Questions (FAQ)

When should I use BigQuery ML instead of Vertex AI custom training?

Use BigQuery ML when the data already lives in BigQuery, the model family fits one of the supported types, and you want to avoid moving data out of the warehouse. Use Vertex AI custom training when you need a non-supported algorithm (for example, a custom PyTorch transformer), when you need GPUs or TPUs at training time, or when the team already has containerized training code. The two services are complementary; many teams train in BigQuery ML and serve via Vertex AI endpoints.

How do I prevent data leakage in BigQuery ML training?

Three steps. First, use data_split_method='SEQ' with a timestamp data_split_col for any data with a time dimension. Second, build feature columns inside the TRANSFORM clause so future-looking aggregations cannot accidentally appear. Third, never include the label or any direct derivative of the label as a feature; ML.FEATURE_INFO will not catch this for you.

What is the difference between ARIMA_PLUS and ARIMA_PLUS_XREG?

ARIMA_PLUS models a univariate time series: it sees only the target value over time, models trend, seasonality, and holiday effects, and cleans anomalies. ARIMA_PLUS_XREG adds exogenous regressors, columns that influence the target but are not the target. Marketing spend, weather, store openings, or promotional flags are typical regressors. Use ARIMA_PLUS_XREG when you have known future values for those regressors, since the forecast horizon needs them as inputs.
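A hedged sketch of the XREG workflow follows. The tables `mart.orders_daily_with_drivers` and `mart.planned_drivers` and the columns `marketing_spend` and `avg_temp` are hypothetical; the point is that the forecast call needs a table of future regressor values.

```sql
-- Train: every selected column other than the timestamp and target is a regressor.
CREATE OR REPLACE MODEL `mart.daily_orders_xreg`
OPTIONS(
  model_type = 'ARIMA_PLUS_XREG',
  time_series_timestamp_col = 'order_date',
  time_series_data_col = 'orders'
) AS
SELECT order_date, orders, marketing_spend, avg_temp
FROM `mart.orders_daily_with_drivers`;

-- Forecast: supply the planned regressor values for the future dates.
SELECT *
FROM ML.FORECAST(
  MODEL `mart.daily_orders_xreg`,
  STRUCT(30 AS horizon, 0.9 AS confidence_level),
  (SELECT order_date, marketing_spend, avg_temp FROM `mart.planned_drivers`)
);
```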

Can BigQuery ML models be exported and run outside BigQuery?

Yes, with caveats. Linear, logistic, K-means, matrix factorization, boosted tree, DNN, and Wide and Deep models can be exported via EXPORT MODEL to Cloud Storage as TensorFlow SavedModel or XGBoost Booster files. AutoML and ARIMA_PLUS exports have different formats and limitations. Once exported, you can deploy to Vertex AI endpoints, Cloud Run, GKE, or even on-premises. Models registered to the Vertex AI Model Registry can also deploy directly to managed endpoints without manual export.

How does hyperparameter tuning bill in BigQuery ML?

Each trial is a separate training job and consumes slots accordingly. With on-demand pricing, you pay per byte processed across all trials; with capacity-based pricing, the trials draw from your reservation. max_parallel_trials controls concurrency, which affects wall-clock time but not total slot consumption. To control cost, set num_trials conservatively, sample the training data while iterating, and only run the full tuning sweep on the final feature set.

What evaluation metrics should I trust for an imbalanced classification problem?

For imbalanced classes (for example, a 2% churn rate), accuracy is misleading because predicting "no churn" for everyone scores 98%. Use ROC AUC for ranking quality, precision, recall, and F1 for the rare class, and a confusion matrix at your chosen threshold. ML.EVALUATE returns precision, recall, F1, and ROC AUC at a given threshold; ML.CONFUSION_MATRIX and ML.ROC_CURVE give the per-threshold counts you need to pick a threshold or trace the precision-recall trade-off.
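A sketch of that deeper inspection, using the churn model from earlier. The 0.2 threshold and the threshold grid are illustrative, and scoring the training table is only for brevity; a held-out table is the better input.

```sql
-- Confusion matrix at a lower threshold, which usually suits a rare positive class.
SELECT *
FROM ML.CONFUSION_MATRIX(
  MODEL `mart.churn_logit`,
  (SELECT * EXCEPT(customer_id) FROM `mart.customer_features`),
  STRUCT(0.2 AS threshold)
);

-- Recall and precision across a grid of thresholds.
SELECT
  threshold,
  recall,
  false_positive_rate,
  SAFE_DIVIDE(true_positives, true_positives + false_positives) AS precision
FROM ML.ROC_CURVE(
  MODEL `mart.churn_logit`,
  (SELECT * EXCEPT(customer_id) FROM `mart.customer_features`),
  GENERATE_ARRAY(0.1, 0.9, 0.1)
)
ORDER BY threshold;
```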

Related Topics

BigQuery Data Modeling and Clustering: partitioning and clustering choices that shape feature tables for BQML.
Vertex AI Pipelines and Model Registry: orchestrate retraining and govern BQML models alongside custom Vertex AI models.
Feature Engineering with Dataflow and BigQuery: when feature work belongs upstream in Dataflow versus inside a TRANSFORM clause.
