
Sunday, 24 November 2019

Shahed

GCP Google BigQuery Cost Reporting - Control your costs

Setting up the BigQuery audit logs export in GCP to monitor your BigQuery costs




Google BigQuery is an excellent database, but its costs need to be monitored very closely. You can use the methods below to start monitoring costs on a daily basis and find out which users are running unfiltered queries, where every query is costing you a packet.

Set up the BigQuery log export. Do the following in a project that contains BigQuery:

Pre-Req: Create a BigQuery project and dataset.

1. Go to the Google Cloud Platform console and select Logging -> Exports under Stackdriver Logging.

2. Click **Exports** and then **Create Export**.

3. Add a Sink Name and select Custom Destination as the Sink Service. The Sink Destination should be set to bigquery.googleapis.com/projects/<project-name>/datasets/<dataset-name>, using the project and dataset names you created earlier.

Use the drop-down arrow on the filter box and convert to an advanced filter; make sure it contains the following: resource.type="bigquery_resource"

4. Click Create Sink

If you get a permission error, that is fine; do the following. The error occurs when the project you are exporting to is different from the project you set up the logging export in. In this case the service account that writes the logs into the BigQuery dataset you created will not have permission to do so.

1. Go to BigQuery in the project the logs are exported to and click on the dropdown next to the dataset you have chosen. Click Share Dataset.

2. Get the name of the service account by going to Stackdriver Logging in the project where you set up the logging export, then **Exports**, and copy the **Writer Identity**.

3. Add this Writer Identity into the Share Dataset window in BigQuery from Step 1

4. Give the account **Can edit** access, and click **Add**, and then **Save Changes**
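
If you prefer the command line, the Writer Identity can also be read with gcloud; the sink name below is a placeholder for whatever you named your export sink.

```shell
# Print the service account (Writer Identity) behind a logging sink.
# Run this in the project where the logging export was created.
gcloud logging sinks describe <sink_name> --format='value(writerIdentity)'
```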

The BigQuery audit log export is now set up. The table will be updated periodically, and the BigQuery audit log table is date partitioned.

Do this for every project you have so that you can collect costs for all your projects; you can reuse the same destination project and dataset for all of them.
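
Once logs start flowing, a quick sanity check is to count the rows arriving each day. This sketch assumes the date-sharded table names the export creates (the date is the table suffix); replace <dataset-name> with your own.

```sql
-- Count audit-log rows per day landing in the export dataset
SELECT
  _TABLE_SUFFIX AS log_date,
  COUNT(*) AS events
FROM `<dataset-name>.cloudaudit_googleapis_com_data_access_*`
GROUP BY log_date
ORDER BY log_date DESC
```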



Using the Google Cloud SDK Alternative

Use the Google Cloud SDK or the gcloud CLI.

1. First, set your project:

gcloud config set project <project-name>

2. Then run the following command, adjusting it to your target dataset and project:

gcloud beta logging sinks create <sink_name> bigquery.googleapis.com/projects/<project-name>/datasets/<dataset-name> --log-filter='resource.type="bigquery_resource"'
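
To confirm the sink was created, the same CLI can list and describe it; the sink name below is the placeholder used above.

```shell
# List all logging sinks in the current project and confirm the new one exists
gcloud logging sinks list

# Show the sink's destination, filter and Writer Identity
gcloud logging sinks describe <sink_name>
```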

Using your dataset and table, you can now report your costs from the BigQuery dataset in a tool of your choice.

For a list of your projects that are being audited:


SELECT DISTINCT resource.labels.project_id
FROM `<dataset-name>.cloudaudit_googleapis_com_data_access_*`

View query costs across projects:


SELECT
  timestamp AS Date,
  resource.labels.project_id AS ProjectId,
  protopayload_auditlog.serviceName AS ServiceName,
  protopayload_auditlog.methodName AS MethodName,
  protopayload_auditlog.status.code AS StatusCode,
  protopayload_auditlog.status.message AS StatusMessage,
  protopayload_auditlog.authenticationInfo.principalEmail AS UserId,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobName.jobId AS JobId,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobConfiguration.query.query AS Query,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobConfiguration.query.destinationTable.projectId AS DestinationTableProjectId,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobConfiguration.query.destinationTable.datasetId AS DestinationTableDatasetId,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobConfiguration.query.destinationTable.tableId AS DestinationTableId,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobConfiguration.query.createDisposition AS CreateDisposition,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobConfiguration.query.writeDisposition AS WriteDisposition,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobConfiguration.dryRun AS DryRun,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatus.state AS JobState,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatus.error.code AS JobErrorCode,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatus.error.message AS JobErrorMessage,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.createTime AS JobCreateTime,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.startTime AS JobStartTime,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.endTime AS JobEndTime,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.billingTier AS BillingTier,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.totalBilledBytes AS TotalBilledBytes,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.totalProcessedBytes AS TotalProcessedBytes,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.totalBilledBytes / 1000000000 AS TotalBilledGigabytes,
  (protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.totalBilledBytes / 1000000000) / 1000 AS TotalBilledTerabytes,
  ((protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.totalBilledBytes / 1000000000) / 1000) * 5 AS TotalCost,
  1 AS Queries
FROM `<dataset-name>.cloudaudit_googleapis_com_data_access_*`
WHERE
  protopayload_auditlog.serviceName = 'bigquery.googleapis.com'
  AND protopayload_auditlog.methodName = 'jobservice.jobcompleted'
  AND protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.eventName = 'query_job_completed'
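
To answer the "which users are costing you a packet" question directly, the same fields can be rolled up per user per day. This is a sketch using the same flat $5-per-terabyte on-demand rate assumed in the query above; replace <dataset-name> with your own.

```sql
-- Approximate daily billed terabytes and cost per user
SELECT
  DATE(timestamp) AS Day,
  protopayload_auditlog.authenticationInfo.principalEmail AS UserId,
  COUNT(*) AS Queries,
  SUM(protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.totalBilledBytes) / 1000000000 / 1000 AS TotalBilledTerabytes,
  SUM(protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.totalBilledBytes) / 1000000000 / 1000 * 5 AS ApproxCostUSD
FROM `<dataset-name>.cloudaudit_googleapis_com_data_access_*`
WHERE
  protopayload_auditlog.serviceName = 'bigquery.googleapis.com'
  AND protopayload_auditlog.methodName = 'jobservice.jobcompleted'
  AND protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.eventName = 'query_job_completed'
GROUP BY Day, UserId
ORDER BY ApproxCostUSD DESC
```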




Over and Out

DELIVERBI Team

About Authors

Shahed Munir

Krishna Udathu

Shahed and Krishna are Oracle / Big Data Experts