This article describes the main steps required to create the Amazon Athena SQL data collector. This collector enables you to run Athena queries and stream the results as metrics into Anodot.
Creating an Amazon Athena SQL Data Source
- In the Navigation Panel, go to Integrations > Catalog.
- Use the Search box OR click the Databases filter to locate the data source.
- Hover over the Amazon Athena SQL tile, and click Start. The Athena SQL dialog is displayed, as shown below.
Note: If the data source has already been used, a dialog is displayed in which you can select from one of the listed sources. Alternatively, create a new source by clicking Add a new source.
- Create the role in AWS, according to the required permissions listed in the AWS documentation; use the external ID from Anodot in the role ARN for enhanced security. See an example policy at the bottom of this article.
- After choosing a Region and Role ARN in the Athena SQL dialog shown above, click LIST WORKGROUPS to see the list of existing workgroups available to the role.
- Define a combination of workgroup and S3 output location according to the following:
  - A workgroup is used and has the S3 output location defined within it. In this case, the S3 output location will be taken via one of the following options:
    - If S3 output location is defined AND “enforce = true” is defined in the chosen workgroup, the output location will be presented in the field without the ability to edit it.
    - If S3 output location is defined AND “enforce = false” is defined in the chosen workgroup, the output location will be presented in the field with the ability to edit it.
  - A workgroup is used, but the S3 output location is not defined in the workgroup. In this case, specify it in the field; the S3 location must already exist when you define the data source.
  - If a workgroup is not used, fill in the S3 output location manually; the location must already exist when you define the data source.
- Click LIST DATABASES to get the list of available databases to query, select one of the databases, and then click CONTINUE.
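The workgroup and S3 output location cases above can be checked outside the dialog before you fill it in. The following is a minimal sketch, not Anodot's implementation. It assumes you have already fetched a workgroup description with the boto3 Athena client's `get_work_group` call and reads the `ResultConfiguration.OutputLocation` and `EnforceWorkGroupConfiguration` fields of that response. The function name `resolve_output_location` is hypothetical.

```python
from typing import Optional, Tuple


def resolve_output_location(
    workgroup_response: Optional[dict],
    manual_location: Optional[str] = None,
) -> Tuple[str, bool]:
    """Return (s3_output_location, editable) for the cases described above.

    workgroup_response is the dict returned by boto3's
    athena.get_work_group(WorkGroup=...), or None when no workgroup is used.
    manual_location is the S3 location typed in by the user, if any.
    """
    config = {}
    if workgroup_response is not None:
        config = workgroup_response.get("WorkGroup", {}).get("Configuration", {})

    location = config.get("ResultConfiguration", {}).get("OutputLocation")
    enforced = config.get("EnforceWorkGroupConfiguration", False)

    if location and enforced:
        # Workgroup location with "enforce = true": shown, not editable.
        return location, False
    if location:
        # Workgroup location with "enforce = false": shown, but the user
        # may override it.
        return manual_location or location, True
    if manual_location:
        # No workgroup, or a workgroup without a location: the user must
        # supply a location that already exists.
        return manual_location, True
    raise ValueError("An S3 output location must be specified manually")
```

With a real client, the first argument would come from something like `boto3.client("athena", region_name=...).get_work_group(WorkGroup=name)`, using the role described above.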
Creating an Amazon Athena SQL Stream Query
If you have just created an Amazon Athena SQL data source, skip to step 3.
- In the Sources page (accessed by clicking Integrations > Sources in the Navigation Panel), choose the Amazon Athena source for which you want to create a stream query.
Note: The streams associated with that source are displayed. If the streams panel is empty, no stream queries exist for that source.
- Hover over the Amazon Athena data source, and click + New Stream. The Stream Query page is displayed.
Note: The default stream name includes a timestamp of when the stream was created. You can edit the name.
- Define the stream settings, as required. See Creating a Stream Query from a Database for further information.
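Before defining a stream, it can help to confirm that the role can actually run a query and read its results. The sketch below is one way to do that with the boto3 Athena client, using the `start_query_execution`, `get_query_execution`, and `get_query_results` calls that correspond to actions in the example policy. It is an illustration only and does not describe how Anodot runs queries internally. The function name `run_athena_query` and its polling parameters are hypothetical.

```python
import time


def run_athena_query(client, sql, database, output_location, workgroup=None,
                     poll_seconds=1.0, max_polls=60):
    """Start a query, wait for it to finish, and return the first page of rows.

    client is a boto3 Athena client (or any object exposing the same three
    methods). Each returned row is a list of string values, or None for NULL.
    """
    params = {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }
    if workgroup:
        params["WorkGroup"] = workgroup

    query_id = client.start_query_execution(**params)["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    for _ in range(max_polls):
        status = client.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state == "SUCCEEDED":
            break
        if state in ("FAILED", "CANCELLED"):
            reason = status["QueryExecution"]["Status"].get("StateChangeReason", "")
            raise RuntimeError(f"Query {query_id} {state}: {reason}")
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"Query {query_id} did not finish in time")

    result = client.get_query_results(QueryExecutionId=query_id)
    # For SELECT statements the first row holds the column names.
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]
```

A typical call would pass `boto3.client("athena", region_name=...)` created with the role's credentials, together with the database and S3 output location chosen when the data source was defined.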
Policy example
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "athena:ListWorkGroups",
        "athena:ListDataCatalogs"
      ],
      "Resource": "*",
      "Effect": "Allow"
    },
    {
      "Action": [
        "athena:StartQueryExecution",
        "athena:GetQueryResults",
        "athena:GetQueryResultsStream",
        "athena:GetQueryExecution",
        "athena:ListTableMetadata",
        "athena:GetTableMetadata",
        "athena:GetWorkGroup",
        "athena:ListDatabases",
        "athena:GetPreparedStatement",
        "athena:ListPreparedStatements",
        "athena:CreatePreparedStatement",
        "athena:UpdatePreparedStatement",
        "athena:DeletePreparedStatement"
      ],
      "Resource": [
        "arn:aws:athena:<region>:<account-id>:datacatalog/AwsDataCatalog",
        "arn:aws:athena:<region>:<account-id>:workgroup/<workgroup-name>"
      ],
      "Effect": "Allow"
    },
    {
      "Action": [
        "glue:GetTables",
        "glue:GetTable",
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetPartition",
        "glue:GetPartitions"
      ],
      "Resource": [
        "arn:aws:glue:<region>:<account-id>:catalog",
        "arn:aws:glue:<region>:<account-id>:database/<database-name>",
        "arn:aws:glue:<region>:<account-id>:table/<database-name>/<table-prefix>*"
      ],
      "Effect": "Allow"
    },
    {
      "Action": "s3:GetBucketLocation",
      "Resource": "arn:aws:s3:::<query-results-bucket>",
      "Effect": "Allow"
    },
    {
      "Condition": {
        "StringLike": {
          "s3:prefix": [
            "<prefix>",
            "<prefix>/*"
          ]
        }
      },
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::<query-results-bucket>",
      "Effect": "Allow"
    },
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucketMultipartUploads",
        "s3:ListMultipartUploadParts",
        "s3:CreateMultipartUpload"
      ],
      "Resource": "arn:aws:s3:::<query-results-bucket>/<prefix>/*",
      "Effect": "Allow"
    },
    {
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetObject"
      ],
      "Resource": [
        "arn:aws:s3:::<data-bucket>",
        "arn:aws:s3:::<data-bucket>/<prefix>*"
      ],
      "Effect": "Allow"
    },
    {
      "Condition": {
        "StringLike": {
          "s3:prefix": [
            "<prefix>*"
          ]
        }
      },
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::<bucket>",
      "Effect": "Allow"
    }
  ]
}
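
The permissions policy above controls what the role may do. Separately, the role needs a trust policy that controls who may assume it, and that is where an external ID is normally enforced in AWS. The sketch below builds a generic cross-account trust policy with an `sts:ExternalId` condition. It is an assumption-laden illustration: this article does not state which AWS principal Anodot uses, so `trusted_account_id` is a placeholder, and the helper name `build_trust_policy` is hypothetical. Take the actual account and external ID values from the Athena SQL dialog or from Anodot support.

```python
import json


def build_trust_policy(trusted_account_id: str, external_id: str) -> dict:
    """Build a cross-account trust policy that requires an external ID.

    trusted_account_id is a placeholder for the AWS account allowed to
    assume the role; external_id is the value supplied by Anodot.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The role can be assumed only when the caller passes
                # this exact external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(json.dumps(build_trust_policy("<trusted-account-id>", "<external-id>"), indent=2))
```

The printed JSON can be pasted into the role's trust relationship in the IAM console once the placeholders are replaced.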