Monitoring and metrics

An ACE installation publishes a wide range of metrics that can be used to monitor the system.

ACE publishes its metrics to Prometheus. Any metrics toolchain compatible with Prometheus (e.g. Grafana) can be used to inspect and monitor them. (See the examples for some basic Grafana dashboards.)

Prometheus

The Prometheus UI is by default available on port 9090 and allows raw query access to all metrics reported to it since it was started. Raw queries are, however, not very human friendly. To monitor and make sense of the values, we recommend using a tool such as Grafana.
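The raw query access described above is also available programmatically through the Prometheus HTTP API. The following is a minimal sketch in Python, assuming Prometheus is reachable on localhost at the default port 9090. The helper names build_query_url and run_query are illustrative and not part of ACE.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Prometheus runs locally on its default port.
PROMETHEUS_URL = "http://localhost:9090"


def build_query_url(expr, base_url=PROMETHEUS_URL):
    """Return the instant-query URL of the Prometheus HTTP API for `expr`."""
    query = urllib.parse.urlencode({"query": expr})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def run_query(expr, base_url=PROMETHEUS_URL, timeout=5.0):
    """Run an instant query and return the list of result samples."""
    url = build_query_url(expr, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError("Prometheus query failed: %r" % (payload,))
    return payload["data"]["result"]
```

Calling run_query("ace_contentstorage_cache_requests_total") against a running installation should return one entry per label combination, each with a "metric" dictionary of labels and a "value" pair of timestamp and sample value.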

Kinds of metrics

There are several kinds of metrics available in an ACE installation.

Machine / VM / Container metrics

Metrics related to the host machines running the ACE application stack (such as CPU and disk space usage) can be exported to Prometheus using the Node Exporter Docker application.

Dropwizard metrics

Metrics provided by Dropwizard (the framework used to host the ACE services) are automatically exported to Prometheus by ACE. These metrics are quite generic and can be used to monitor HTTP requests and responses from the ACE services.

Cache metrics

These metrics relate to the performance and status of the internal ACE application caches.

ace_contentstorage_cache_requests_total

Type: Counter
Labels: operation, cache_status, result_status

This metric relates to the memory caches in ACE, typically cache hits/misses, statuses and executed operations. In other words, it shows how well the cache performs.

operation

Denotes the name of the operation that read or modified the cache.

Possible values
assignToSymbolicView
assignToView
createAlias
createContent
getContent
promoteAlias
releaseAlias
removeFromView
remove
resolveOnWorkspace
updateContent

cache_status

Denotes the cache operation status.

Possible values  Description
hit              The sought entity was found.
miss             The sought entity was not found.
update           An entity was updated in the cache.

result_status

If the operation completed, this label contains a five-digit status code. If the operation failed and an exception was thrown, the label contains the name of the exception.
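As an illustration of how the operation and cache_status labels can be combined, the sketch below computes a hit ratio per operation from counter samples. It assumes the samples have already been fetched from Prometheus as (labels, value) pairs. The function name and the sample numbers are made up for the example.

```python
from collections import defaultdict


def cache_hit_ratio(samples):
    """Return {operation: hits / (hits + misses)} from counter samples.

    `samples` is an iterable of (labels, value) pairs, where `labels` is a
    dict holding at least the operation and cache_status labels. Values are
    summed across all other labels, such as result_status.
    """
    hits = defaultdict(float)
    lookups = defaultdict(float)
    for labels, value in samples:
        status = labels.get("cache_status")
        if status not in ("hit", "miss"):
            continue  # "update" entries are writes, not lookups
        operation = labels.get("operation", "")
        lookups[operation] += value
        if status == "hit":
            hits[operation] += value
    return {op: hits[op] / total for op, total in lookups.items() if total > 0}


# Hypothetical sample values, for illustration only.
example = [
    ({"operation": "getContent", "cache_status": "hit"}, 90),
    ({"operation": "getContent", "cache_status": "miss"}, 10),
    ({"operation": "updateContent", "cache_status": "update"}, 5),
]
print(cache_hit_ratio(example))
```

Operations that only produce update entries are left out of the result, since they have no lookups to compute a ratio from.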

Content pipeline metrics

There are two metrics for pipelines: ace_read_pipeline_duration_seconds and ace_write_pipeline_duration_seconds. These metrics are more detailed than the ace_content_service_read_duration_seconds and ace_content_service_write_duration_seconds metrics (see below) as they measure stages of the pipeline and individual callbacks.

ace_read_pipeline_duration_seconds

Type: Histogram
Labels: variant, contentType, stage, callback

Measures the pipeline execution time in seconds when reading content.

variant

Denotes the variant in which the content was read. If no variant was requested the value (no variant) is used.

contentType

Denotes the content type of the content being read. In case the content type could not be determined, for instance if the content was not found, the value (no contentType) is used.

stage

Denotes the specific stage of the pipeline.

Possible values   Description
aspectMapping     The stage in which aspect mappers are executed.
contentComposing  The stage in which content composers are executed.

callback

Denotes the ID of the specific aspect mapper or content composer being executed. If no callback was executed the value (no callback) is used.

ace_write_pipeline_duration_seconds

Type: Histogram
Labels: contentType, stage, callback

Measures the pipeline execution time in seconds when writing content.

contentType

Denotes the content type of the content being written.

stage

Denotes the specific stage of the pipeline.

Possible values  Description
assignMainAlias  The stage in which the main alias is assigned.
copyFiles        The stage in which files stored in tmp space are copied to content space.
writeComposer    The stage in which write composers are executed.
preStore         The stage in which pre store hooks are executed.

callback

Only applicable in the writeComposer and preStore stages. Denotes the ID of the write composer or pre store hook being executed.
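Prometheus exposes a Histogram as cumulative bucket counters (series ending in _bucket, with an le label for the upper bound), so latency percentiles for a given stage or callback have to be estimated from those buckets. The sketch below shows one way to do that in Python, using linear interpolation within a bucket in the same spirit as the PromQL function histogram_quantile. The function name and the bucket boundaries in the usage note are assumptions, not values taken from ACE.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs. The highest
    upper bound must be +Inf, as in a Prometheus histogram.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        raise ValueError("need at least one finite bucket and a +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations recorded yet
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # The rank falls in the overflow bucket, so the best
                # available answer is the highest finite bound.
                return buckets[-2][0]
            in_bucket = count - lower_count
            if in_bucket == 0:
                return upper_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return buckets[-2][0]
```

For example, with the hypothetical buckets [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)], the estimated 95th percentile is about 0.75 seconds. The accuracy of the estimate depends on how finely the bucket boundaries are spaced around the value of interest.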

Content Storage Couchbase metrics

The ace_couchbase_storage_duration_seconds metric lets you gather information about how well the ACE content storage operations are performing.

ace_couchbase_storage_duration_seconds

Type: Histogram
Labels: operation, suboperation

Measures the content storage operation execution time in seconds.

operation

Denotes the specific operation.

Possible values
assignToView
assignToSymbolicView
removeFromView
createContent
getContent
getHangerInfoNB
updateContent
resolveOnWorkspace
createAlias
promoteAlias
releaseAlias
remove
getWorkspaceInfo

suboperation

Denotes the specific sub operation.

Possible values
reserveAliases
aliasesAvailable
promoteAlias
couchbaseGet
hangerWrites
updateHangerInfo
cleanupAfterConflict
createContent
updateData
writeImmutableData
resolve
couchbaseRemove
remove
query
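Besides buckets, a Prometheus Histogram also exposes a running sum and count of observations (series ending in _sum and _count). Dividing the increase of the sum by the increase of the count between two scrapes gives the mean duration over that interval. The sketch below shows this calculation in Python, keyed by (operation, suboperation). The function name and the numbers in the test data are illustrative assumptions.

```python
def mean_duration(previous, current):
    """Return the mean duration in seconds per key between two scrapes.

    Both arguments map a key, for instance (operation, suboperation), to a
    (sum_seconds, count) pair read from the _sum and _count series.
    """
    result = {}
    for key, (current_sum, current_count) in current.items():
        previous_sum, previous_count = previous.get(key, (0.0, 0))
        delta_count = current_count - previous_count
        if delta_count <= 0:
            # No new observations, or the counters were reset by a restart.
            continue
        result[key] = (current_sum - previous_sum) / delta_count
    return result
```

Keys without new observations are skipped rather than reported as zero, so an idle operation is not mistaken for a fast one.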

ACE service metrics

Most ACE services provide their own metrics. The services and their published metrics are listed below.

All ACE service metrics have some common properties:

  • They are all of type Histogram.
  • They all measure execution time in seconds.
  • They all have two common labels: operation and status.

The operation label contains the name of the invoked operation, for instance getRevision. Each metric lists all its possible operation values.

The status label contains a status code for the operation. If the operation succeeded, the HTTP response code is used, for instance 200. If the operation failed, a detailed five-digit status code is used, for instance 40401 (Not found on view).
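When aggregating on the status label, it can help to separate the two kinds of values. The sketch below classifies a status value by its length, following the rule described above. The http_code_of helper additionally assumes that the first three digits of a five-digit code are the HTTP response code, which is inferred only from the 40401 example and is not confirmed by this document. Both function names are illustrative.

```python
def classify_status(status):
    """Classify a status label value.

    Returns "success" for a three-digit HTTP response code, "failure" for a
    five-digit detailed status code, and "unknown" for anything else.
    """
    if status.isascii() and status.isdigit():
        if len(status) == 3:
            return "success"
        if len(status) == 5:
            return "failure"
    return "unknown"


def http_code_of(status):
    """Return the HTTP response code for a status label value, or None.

    Assumption: a five-digit code starts with the HTTP response code
    (40401 -> 404). This is inferred from a single example.
    """
    if classify_status(status) == "unknown":
        return None
    return int(status[:3])
```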

NOTE: Metrics for the ACE services are currently not reported when accessed from within content pipeline callbacks.

Content Service

ace_content_service_read_duration_seconds

Type: Histogram
Labels: operation, status, variant, contentType

Measures the execution time in seconds for content read operations.

operation
Possible values              Description
getVersion                   Reads a specific content version.
getContentRedirect           Reads the latest revision of a content. Redirects to 'getRevision'.
getRevision                  Reads a specific content revision.
getContentWithAliasFromView  Reads the content version assigned to a specific view.

variant

Denotes the variant in which the content was read. If no variant was requested the value (no variant) is used.

contentType

Denotes the content type of the content being read. In case the content type could not be determined, for instance if the content was not found, the value (no contentType) is used.

ace_content_service_write_duration_seconds

Type: Histogram
Labels: operation, status

Measures the execution time in seconds for content write operations.

operation
Possible values  Description
updateContent    Updates an existing content.
createContent    Creates a new content.
deleteContent    Deletes an existing content.

ace_content_service_meta_duration_seconds

Type: Histogram
Labels: operation, status

Measures the execution time in seconds for content metadata read operations.

operation
Possible values  Description
getMetadata      Reads the unversioned metadata of a content.
getHistory       Reads the history of a content.

ace_content_service_import_duration_seconds

Type: Histogram
Labels: operation, status, force

Measures the execution time in seconds for content import operations.

operation
Possible values   Description
importContent     Imports a single content in content-JSON format.
importSingleFile  NOT YET IMPLEMENTED. See below.
importJar         Imports a jar containing files in content-JSON format and/or with semantic file endings.

Note: The importSingleFile operation supports importing single files with semantic file endings (such as .contentType and .prestore.js). This service operation does not yet report any metrics.

force

Corresponds to the import parameter force.

Possible values  Description
false            Contents will be imported only if new or changed.
true             Contents will be re-imported even if they have not changed.

ace_content_service_workspace_duration_seconds

Type: Histogram
Labels: operation, status, workspace

Measures execution time in seconds for all operations performed on a workspace.

operation
Possible values               Description
createOnWorkspace             Creates a content on a workspace.
getWorkspaceContentWithAlias  Reads a content from a workspace.
updateOnWorkspace             Updates a content on a workspace.
deleteFromWorkspace           Deletes a content from a workspace.
getWorkspaceInfo              Retrieves information about a workspace.
clearWorkspace                Removes all contents from a workspace.
promote                       Moves a content from a workspace to main storage.
releaseAlias                  Removes a previously claimed alias.

workspace

Denotes the workspace ID used for the service operation.

ace_type_service_duration_seconds

Type: Histogram
Labels: operation, type, recursive

Measures execution time in seconds for the getType service operation.

operation
Possible values  Description
getType          Returns information on the type identified by {type}.

type

Denotes the name of the retrieved type.

recursive

Denotes whether or not all sub types should be retrieved along with the requested type.

ace_type_service_list_duration_seconds

Type: Histogram
Labels: operation, verbose, isContentType

Measures execution time in seconds for the getTypes operation.

operation
Possible values  Description
getTypes         Lists all types available in the system.

verbose

Denotes whether or not the type names in the list of returned types should be replaced with the actual type definitions.

isContentType

Denotes whether or not to only return types that are also content types.

Search Service

ace_search_service_duration_seconds

Type: Histogram
Labels: operation, status, collection, variant, inlineData

Measures the execution time in seconds for search operations.

operation
Possible values  Description
searchByPost     Performs a search against the {collection} collection.
search           Performs a search against the {collection} collection.

collection

Denotes the name of the collection the search operation was performed against.

variant

Denotes the variant ID used for the search operation.

inlineData

Denotes whether or not inline content result data was requested for the search operation.

File Service

ace_file_service_operation_duration_seconds

Type: Histogram
Labels: operation, status, host, space

Measures the execution time in seconds for content file operations.

operation
Possible values     Description
getFileInfo         Delivers file information for the content file stored in {space}, {host}, {path}.
uploadFile          Uploads a file into the storage space {space}.
uploadFileWithPath  Uploads a file into {space}, {host}, {path}.
getFile             Delivers the content file stored in {space}, {host}, {path}.

space

Denotes the storage space from which to retrieve the content file.

host

Denotes the storage host name from which to retrieve the content file.

Image Service

ace_image_service_duration_seconds

Type: Histogram
Labels: operation, status

Measures the execution time in seconds for image operations.

operation
Possible values                Description
getImageByFileServiceLocation  Delivers a working copy image from the image file {path} in the {scheme} storage space.
getImageByAlias                Delivers an image from the image file {path} from the content identified by {alias}.
getImageByVersion              Delivers an image from the image file {path} from the content version {version}.

Taxonomy Service

ace_taxonomy_service_duration_seconds

Type: Histogram
Labels: operation, status, depth

Measures execution time in seconds for taxonomy operations.

operation
Possible values  Description
getObject        Retrieves the taxonomy identified by {id}.
listTaxonomies   Lists all taxonomies available in the system.

depth

Only applicable for the getTaxonomy operation. Denotes how many levels of the taxonomy tree to return data for.

File Delivery Service

ace_file_delivery_service_duration_seconds

Type: Histogram
Labels: operation, status

Measures execution time in seconds for content file delivery operations.

operation
Possible values   Description
getFileByVersion  Delivers the content file {path} from the content version {id}.
getFileByAlias    Delivers the content file {path} from the content identified by {namespace} and {id}.

ACE indexer metrics

The ACE indexer is the service responsible for indexing content versions and metadata into Solr in order to make them searchable.

ace_indexer_indexing_duration_seconds

Type: Histogram
Labels: collection

Measures execution time in seconds for indexing operations.

collection

Denotes the name(s) of the Solr collection(s) which the indexer is updating.

ace_indexer_contents_total

Type: Counter
Labels: collection, result

Counts the total number of contents processed by the indexer.

collection

Denotes the name(s) of the Solr collection(s) which the indexer is updating.

result
Possible values
success
skipped
failure
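To inspect a counter such as ace_indexer_contents_total without going through Prometheus, the samples can be read from text in the Prometheus text exposition format. The sketch below is a simplified parser in Python that extracts the samples of one metric and sums them per result value. It assumes the metrics text has already been obtained; how and where an ACE service exposes it is not covered here. The function names are illustrative, and escaped characters inside label values are not unescaped.

```python
import re

# Matches "name{labels} value" or "name value"; a trailing timestamp is ignored.
_SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
# Matches one label pair: name="value".
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text, metric_name):
    """Return a list of (labels, value) pairs for `metric_name` in `text`."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_RE.match(line)
        if match is None or match.group("name") != metric_name:
            continue
        labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
        samples.append((labels, float(match.group("value"))))
    return samples


def result_totals(samples):
    """Sum sample values per result label, across all collections."""
    totals = {}
    for labels, value in samples:
        result = labels.get("result", "")
        totals[result] = totals.get(result, 0.0) + value
    return totals
```

Comparing the failure total with the sum of all results over time gives a simple indication of whether indexing problems are isolated or ongoing.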