Insights Data Schemas
2 (unofficial)
That form of data is produced by the ccx_data_pipeline service, which provides
a publisher class that can send the generated reports to a selected Kafka topic.
This class is named ccx_data_pipeline.kafka_publisher.KafkaPublisher and its
source code can be found in the service repository (see the link below this
paragraph). The reports generated by the framework are enhanced with more
context information taken from different sources, such as the organization ID,
account number, unique cluster name, and the LastChecked timestamp (taken from
the incoming Kafka record containing the URL to the archive).
Other relevant information about ccx_data_pipeline can be found at
https://redhatinsights.github.io/ccx-data-pipeline/.
Data produced by ccx_data_pipeline is in JSON format with the following five
top-level required attributes:
OrgID (positive integer with organization ID)
AccountNumber (positive integer with account number)
ClusterName (string containing UUID with cluster name)
Report (nested JSON-like structure that contains results of rule execution)
LastChecked (timestamp with TZ info stored as a string)
NOTE
All required attributes are described in more detail below, including the
Report structure.
Metadata (object): if the input message received from Ingress contains some
metadata relevant to CCX, it will be extracted and sent without any change as
the Metadata field.
Version (positive integer): should be included in the message so that the
schema can change without breaking other services and tools.
The JSON generated by the ccx_data_pipeline service and produced into the
Kafka topic has the following format:
{
"OrgID": 123456, // (int) - number that we get from b64_identity field
"AccountNumber": 223344, // (int) - number that we get from b64_identity field
"ClusterName": "aaaaaaaa-bbbb-cccc-dddd-000000000000", // (string) - cluster UUID that we read from URL
"Report": {...}, // nested JSON structure that contains results of executing rules
"LastChecked": "2020-01-23T16:15:59.478901889Z" // (string) - time of the archive uploading in ISO 8601 format, gotten from "timestamp" field
}
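The comments above say that OrgID and AccountNumber come from the b64_identity field. The following is a minimal sketch of such a decoding step, not the code used by the service. It assumes that b64_identity holds base64-encoded UTF-8 JSON and uses the key paths this document gives for both values (identity, internal, org_id and identity, account_number). The function name, the sample payload, and the conversion with int() are assumptions made for illustration only.

```python
import base64
import json
from typing import Tuple


def extract_identity(b64_identity: str) -> Tuple[int, int]:
    """Return (OrgID, AccountNumber) decoded from a b64_identity value."""
    # Assumption: the value is base64-encoded UTF-8 JSON.
    decoded = json.loads(base64.b64decode(b64_identity))
    identity = decoded["identity"]
    # Key paths as described in this document:
    #   identity -> internal -> org_id
    #   identity -> account_number
    org_id = int(identity["internal"]["org_id"])
    account_number = int(identity["account_number"])
    return org_id, account_number


# Hypothetical sample payload, built here only for illustration.
sample = {"identity": {"account_number": "223344", "internal": {"org_id": "123456"}}}
encoded = base64.b64encode(json.dumps(sample).encode("utf-8")).decode("ascii")
result = extract_identity(encoded)
```

A missing key raises KeyError here; a real consumer would probably want to turn that into a more descriptive error.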
The JSON generated by ccx-data-pipeline can contain an optional key:
{
"OrgID": 123456, // (int) - number that we get from b64_identity field
"AccountNumber": 223344, // (int) - number that we get from b64_identity field
"ClusterName": "aaaaaaaa-bbbb-cccc-dddd-000000000000", // (string) - cluster UUID that we read from URL
"Report": {...}, // nested JSON structure that contains results of executing rules
"LastChecked": "2020-01-23T16:15:59.478901889Z", // (string) - time of the archive uploading in ISO 8601 format, gotten from "timestamp" field
"Metadata": {
"gathering_time": "2020-01-23T16:15:59.478901889Z"
}
}
The keys and values of Metadata are not relevant at this point, as all of
them will be sent to the Insights Results Aggregator, which will read and parse
them appropriately.
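As an illustration of the top-level schema described above, here is a rough validation sketch. It only checks that the five required attributes are present with the expected basic types and that an optional Metadata value is an object. The function name and the wording of the problem messages are hypothetical, and this is not the validation performed by any of the services.

```python
from typing import Any, Dict, List

# Required top-level attributes and the Python types they map to.
REQUIRED_ATTRIBUTES = {
    "OrgID": int,
    "AccountNumber": int,
    "ClusterName": str,
    "Report": dict,
    "LastChecked": str,
}


def find_schema_problems(message: Dict[str, Any]) -> List[str]:
    """Return a list of readable problems; the list is empty when none are found."""
    problems = []
    for name, expected_type in REQUIRED_ATTRIBUTES.items():
        if name not in message:
            problems.append(f"missing required attribute {name}")
            continue
        value = message[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            problems.append(f"{name} should be {expected_type.__name__}")
        elif expected_type is int and value <= 0:
            problems.append(f"{name} should be a positive integer")
    # Metadata is optional, but it is described as an object when present.
    if "Metadata" in message and not isinstance(message["Metadata"], dict):
        problems.append("Metadata should be an object")
    return problems


message = {
    "OrgID": 123456,
    "AccountNumber": 223344,
    "ClusterName": "aaaaaaaa-bbbb-cccc-dddd-000000000000",
    "Report": {},
    "LastChecked": "2020-01-23T16:15:59.478901889Z",
}
problems = find_schema_problems(message)
```

Returning a list of problems instead of raising on the first one makes it easier to log everything that is wrong with a message at once.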
NOTE
ClusterName uses its canonical textual representation: the 16 octets of a
UUID are represented as 32 hexadecimal (base-16) digits, displayed in five
groups separated by hyphens, in the form 8-4-4-4-12 for a total of 36
characters (32 hexadecimal characters and 4 hyphens).
An example of UUID:
3ba9b042-b8b8-4714-98e9-17915c2eeb95
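The 8-4-4-4-12 layout can be checked with a short regular expression. The sketch below is only an illustration: the function name is hypothetical, and accepting upper-case hexadecimal digits is an assumption. Python's uuid.UUID constructor also accepts other spellings (for example without hyphens), so it is used here only to show the 16 octets, not to validate the canonical form.

```python
import re
import uuid

# Five groups of 8-4-4-4-12 hexadecimal digits, 36 characters in total.
CANONICAL_UUID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_canonical_uuid(cluster_name: str) -> bool:
    """Return True when the string uses the canonical 36-character form."""
    return CANONICAL_UUID.fullmatch(cluster_name) is not None


example = "3ba9b042-b8b8-4714-98e9-17915c2eeb95"
canonical = is_canonical_uuid(example)
# The same value seen as raw octets.
octets = uuid.UUID(example).bytes
```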
NOTE
The LastChecked attribute is a timestamp containing the zone designator Z
(aka “Zulu time” or more informally “Greenwich Mean Time”), which marks the
time as UTC.
An example of timestamp with the zone designator:
2020-04-02T09:00:05.268294Z
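Timestamps in this document carry either six or nine fractional digits, while Python's datetime stores microseconds only. The sketch below shows one possible way to parse such values. The function name is hypothetical, and truncating digits beyond microseconds is a choice made for this example, not documented behaviour of the pipeline.

```python
import re
from datetime import datetime, timezone

# Date and time, optional fractional seconds, and the mandatory Z designator.
TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z")


def parse_last_checked(value: str) -> datetime:
    """Parse a LastChecked string into a timezone-aware UTC datetime."""
    match = TIMESTAMP.fullmatch(value)
    if match is None:
        raise ValueError(f"unexpected LastChecked format: {value!r}")
    seconds_part, fraction = match.groups()
    parsed = datetime.strptime(seconds_part, "%Y-%m-%dT%H:%M:%S")
    # datetime keeps microseconds only, so any extra digits are truncated.
    microseconds = int((fraction or "")[:6].ljust(6, "0"))
    return parsed.replace(microsecond=microseconds, tzinfo=timezone.utc)


parsed_micro = parse_last_checked("2020-04-02T09:00:05.268294Z")
parsed_nano = parse_last_checked("2020-01-23T16:15:59.478901889Z")
```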
OrgID: retrieved from the incoming JSON, encoded inside the b64_identity
value. It is extracted from the identity-internal-org_id path of keys. Please
look at the document Incoming messages in platform.upload.announce for more
information about messages containing OrgID among other info.
AccountNumber: retrieved from the incoming JSON, encoded inside the
b64_identity value. It is extracted from the identity-account_number path of
keys. Please look at the document Incoming messages in platform.upload.announce
for more information about messages containing AccountNumber among other info.
ClusterName: the cluster name is retrieved from the downloaded archive. When
the download succeeds and the archive is extracted prior to its processing by
the engine, the cluster ID is read from a file named config/id. Please look
at the document Raw data stored in S3 bucket for more information about S3
objects containing ClusterName among other info.
Report: the JSON generated by the engine when the archive is processed. Its
basic structure is described below.
LastChecked: this field is copied directly from the incoming Kafka record,
where it is stored under the timestamp key.
Report node
The generated cluster reports from Insights results contain three lists of
rules that were either skipped (because of missing requirements, etc.),
passed (the rule got executed but no issue was found), or hit (the rule
got executed and found the issue it was looking for) by the cluster, where each
rule is represented as a dictionary containing identifying information about
the rule itself (hit rules are actually stored in the reports attribute).
The Report node is represented as a standard JSON dictionary with the
following five required attributes:
system: additional information about the cluster
reports: list of rules that detected a problem on the given cluster
pass: list of rules that passed all conditions (i.e. rules without any problem/issue detected)
info: list of rules that return info messages only
fingerprints: ?
Optionally, it can contain a skips attribute, which contains a list of rules
that have been skipped because not all required information was available on
the checked cluster.
system attribute in Report node
TBD (not used in external data pipeline)
reports attribute in Report node
This attribute contains a list of rules that detected a problem on the given cluster. Each element in the list is represented as a node with seven attributes:
rule_id: rule name and error key, joined by the | character
component: fully-qualified name of the rule (unique)
type: information that an issue or issues have been detected by this rule
key: a key that selects the variant of the issue (one rule can detect several different issues)
details: nested object with details about the detected issue; in the example it repeats the type and the error key
tags: tags assigned to the rule
links: links to documentation, Knowledge Base articles, etc.
An example:
"reports": [
{
"rule_id": "tutorial_rule|TUTORIAL_ERROR",
"component": "ccx_rules_ocp.external.tutorial_rule.report",
"type": "rule",
"key": "TUTORIAL_ERROR",
"details": {
"type": "rule",
"error_key": "TUTORIAL_ERROR"
},
"tags": [],
"links": {}
}
]
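In the example above, rule_id is the rule name and the error key joined by a | character, and the error key is repeated in key and in details. The sketch below splits rule_id and checks that the three places agree. The format of rule_id is inferred from the examples in this document, and both function names are hypothetical.

```python
from typing import Any, Dict, Tuple


def split_rule_id(rule_id: str) -> Tuple[str, str]:
    """Split 'rule_name|ERROR_KEY' into its two parts."""
    rule_name, separator, error_key = rule_id.partition("|")
    if not separator or not rule_name or not error_key:
        raise ValueError(f"unexpected rule_id format: {rule_id!r}")
    return rule_name, error_key


def is_consistent(hit: Dict[str, Any]) -> bool:
    """Check that rule_id agrees with key and with details.error_key."""
    _, error_key = split_rule_id(hit["rule_id"])
    return error_key == hit["key"] == hit["details"]["error_key"]


# The hit taken from the example above.
hit = {
    "rule_id": "tutorial_rule|TUTORIAL_ERROR",
    "component": "ccx_rules_ocp.external.tutorial_rule.report",
    "type": "rule",
    "key": "TUTORIAL_ERROR",
    "details": {"type": "rule", "error_key": "TUTORIAL_ERROR"},
    "tags": [],
    "links": {},
}
parts = split_rule_id(hit["rule_id"])
consistent = is_consistent(hit)
```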
skips attribute in Report node
This attribute contains a list of rules that have been skipped because not all required information was available on the checked cluster. Each element in the list is represented as a node with four attributes:
rule_fqdn: fully-qualified name of the rule (unique)
reason: reason why the rule was skipped
details: detailed information about the rule and the condition to skip it
type: information that this rule was skipped
An example:
"skips": [
{
"rule_fqdn": "ccx_rules_ocp.ocs.check_ocs_version.report",
"reason": "MISSING_REQUIREMENTS",
"details": "All: ['ccx_ocp_core.specs.must_gather_ocs.OperatorsOcsMGOCS'] Any: ",
"type": "skip"
},
{
"rule_fqdn": "ccx_rules_ocp.ocs.check_pods_scc.report",
"reason": "MISSING_REQUIREMENTS",
"details": "All: ['ccx_ocp_core.specs.must_gather_ocs.PodsMGOCS'] Any: ",
"type": "skip"
}
]
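A consumer that wants an overview of why rules were skipped could group the entries by their reason attribute. The following sketch uses collections.Counter for that; the function name is hypothetical and the data is the example shown above.

```python
from collections import Counter
from typing import Any, Dict, List


def count_skips_by_reason(skips: List[Dict[str, Any]]) -> Counter:
    """Count skipped rules per value of the reason attribute."""
    return Counter(skip["reason"] for skip in skips)


skips = [
    {
        "rule_fqdn": "ccx_rules_ocp.ocs.check_ocs_version.report",
        "reason": "MISSING_REQUIREMENTS",
        "details": "All: ['ccx_ocp_core.specs.must_gather_ocs.OperatorsOcsMGOCS'] Any: ",
        "type": "skip",
    },
    {
        "rule_fqdn": "ccx_rules_ocp.ocs.check_pods_scc.report",
        "reason": "MISSING_REQUIREMENTS",
        "details": "All: ['ccx_ocp_core.specs.must_gather_ocs.PodsMGOCS'] Any: ",
        "type": "skip",
    },
]
by_reason = count_skips_by_reason(skips)
```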
pass attribute in Report node
This attribute contains a list of rules that passed all conditions (i.e. rules without any problem/issue detected).
info attribute in Report node
TBD
fingerprints attribute in Report node
TBD (not used in external data pipeline)
Report node
The Report node can contain attributes with empty values. Its minimal structure can look like:
{
"Report": {
"system": {
"metadata": {},
"hostname": null
},
"reports": [],
"fingerprints": [],
"info": [],
"pass": []
}
}
In the external data pipeline, if any of these attributes is missing, the report is considered malformed and is neither processed nor stored by the db-writer, the component responsible for storing the reports’ data so that the relevant recommendations can be served via REST APIs.
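The "malformed report" rule can be pictured as a simple presence check for the five required attributes of the Report node. The sketch below only illustrates that rule under this assumption. It is not the db-writer's actual implementation, and the function names are hypothetical.

```python
from typing import Any, Dict, List

# Required attributes of the Report node, as listed in this document.
REQUIRED_REPORT_ATTRIBUTES = ("system", "reports", "pass", "info", "fingerprints")


def missing_report_attributes(report: Dict[str, Any]) -> List[str]:
    """Return the names of required attributes absent from the Report node."""
    return [name for name in REQUIRED_REPORT_ATTRIBUTES if name not in report]


def is_malformed(report: Dict[str, Any]) -> bool:
    """A report is treated as malformed when any required attribute is missing."""
    return bool(missing_report_attributes(report))


# The minimal structure shown above; empty values are acceptable.
minimal_report = {
    "system": {"metadata": {}, "hostname": None},
    "reports": [],
    "fingerprints": [],
    "info": [],
    "pass": [],
}
malformed = is_malformed(minimal_report)
```

Note that skips is not part of the check, because the document describes it as optional.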
The following message contains just an empty report, without any rule hits or rule skips:
{
"OrgID": 12345678,
"AccountNumber": 2233445,
"ClusterName": "aaaaaaaa-bbbb-cccc-dddd-0123456789ab",
"LastChecked": "2020-04-02T09:00:05.268294Z",
"Report": {
"system": {
"metadata": {},
"hostname": null
},
"reports": [],
"fingerprints": [],
"skips": [],
"info": [],
"pass": []
}
}
The following message contains a report with two rule skips but no rule hits:
{
"OrgID": 12345678,
"AccountNumber": 2233445,
"ClusterName": "aaaaaaaa-bbbb-cccc-dddd-0123456789ab",
"LastChecked": "2020-04-02T09:00:05.268294Z",
"Report": {
"system": {
"metadata": {},
"hostname": null
},
"reports": [],
"fingerprints": [],
"skips": [
{
"rule_fqdn": "ccx_rules_ocp.ocs.check_ocs_version.report",
"reason": "MISSING_REQUIREMENTS",
"details": "All: ['ccx_ocp_core.specs.must_gather_ocs.OperatorsOcsMGOCS'] Any: ",
"type": "skip"
},
{
"rule_fqdn": "ccx_rules_ocp.ocs.check_pods_scc.report",
"reason": "MISSING_REQUIREMENTS",
"details": "All: ['ccx_ocp_core.specs.must_gather_ocs.PodsMGOCS'] Any: ",
"type": "skip"
}
],
"info": [],
"pass": []
}
}
A typical message for a cluster “hit” only by the so-called tutorial rule. Additionally, two other rules were skipped:
{
"OrgID": 12345678,
"AccountNumber": 2233445,
"ClusterName": "aaaaaaaa-bbbb-cccc-dddd-0123456789ab",
"LastChecked": "2020-04-02T09:00:05.268294Z",
"Report": {
"system": {
"metadata": {},
"hostname": null
},
"reports": [
{
"rule_id": "tutorial_rule|TUTORIAL_ERROR",
"component": "ccx_rules_ocp.external.tutorial_rule.report",
"type": "rule",
"key": "TUTORIAL_ERROR",
"details": {
"type": "rule",
"error_key": "TUTORIAL_ERROR"
},
"tags": [],
"links": {}
}
],
"fingerprints": [],
"skips": [
{
"rule_fqdn": "ccx_rules_ocp.ocs.check_ocs_version.report",
"reason": "MISSING_REQUIREMENTS",
"details": "All: ['ccx_ocp_core.specs.must_gather_ocs.OperatorsOcsMGOCS'] Any: ",
"type": "skip"
},
{
"rule_fqdn": "ccx_rules_ocp.ocs.check_pods_scc.report",
"reason": "MISSING_REQUIREMENTS",
"details": "All: ['ccx_ocp_core.specs.must_gather_ocs.PodsMGOCS'] Any: ",
"type": "skip"
}
],
"info": [],
"pass": []
}
}
A message returned for a cluster with one real rule hit (not the tutorial rule):
{
"OrgID": 12345678,
"AccountNumber": 2233445,
"ClusterName": "aaaaaaaa-bbbb-cccc-dddd-0123456789ab",
"LastChecked": "2020-04-02T09:00:05.268294Z",
"Report": {
"system": {
"metadata": {},
"hostname": null
},
"reports": [
{
"rule_id": "nodes_requirements_check|NODES_MINIMUM_REQUIREMENTS_NOT_MET",
"component": "ccx_rules_ocp.external.rules.nodes_requirements_check.report",
"type": "rule",
"key": "NODES_MINIMUM_REQUIREMENTS_NOT_MET",
"details": {
"nodes": [
{
"name": "foo1",
"role": "master",
"memory": 8.16,
"memory_req": 16
}
],
"link": "https://docs.openshift.com/container-platform/4.1/installing/installing_bare_metal/installing-bare-metal.html#minimum-resource-requirements_installing-bare-metal",
"type": "rule",
"error_key": "NODES_MINIMUM_REQUIREMENTS_NOT_MET"
},
"tags": [],
"links": {
"docs": [
"https://docs.openshift.com/container-platform/4.1/installing/installing_bare_metal/installing-bare-metal.html#minimum-resource-requirements_installing-bare-metal"
]
}
}
],
"fingerprints": [],
"skips": [
{
"rule_fqdn": "ccx_rules_ocp.ocs.check_ocs_version.report",
"reason": "MISSING_REQUIREMENTS",
"details": "All: ['ccx_ocp_core.specs.must_gather_ocs.OperatorsOcsMGOCS'] Any: ",
"type": "skip"
}
],
"info": [],
"pass": []
}
}
A message returned for a cluster with one real rule hit (not the tutorial rule) and with metadata included:
{
"OrgID": 12345678,
"AccountNumber": 2233445,
"ClusterName": "aaaaaaaa-bbbb-cccc-dddd-0123456789ab",
"LastChecked": "2020-04-02T09:00:05.268294Z",
"Report": {
"system": {
"metadata": {},
"hostname": null
},
"reports": [
{
"rule_id": "nodes_requirements_check|NODES_MINIMUM_REQUIREMENTS_NOT_MET",
"component": "ccx_rules_ocp.external.rules.nodes_requirements_check.report",
"type": "rule",
"key": "NODES_MINIMUM_REQUIREMENTS_NOT_MET",
"details": {
"nodes": [
{
"name": "foo1",
"role": "master",
"memory": 8.16,
"memory_req": 16
}
],
"link": "https://docs.openshift.com/container-platform/4.1/installing/installing_bare_metal/installing-bare-metal.html#minimum-resource-requirements_installing-bare-metal",
"type": "rule",
"error_key": "NODES_MINIMUM_REQUIREMENTS_NOT_MET"
},
"tags": [],
"links": {
"docs": [
"https://docs.openshift.com/container-platform/4.1/installing/installing_bare_metal/installing-bare-metal.html#minimum-resource-requirements_installing-bare-metal"
]
}
}
],
"fingerprints": [],
"skips": [
{
"rule_fqdn": "ccx_rules_ocp.ocs.check_ocs_version.report",
"reason": "MISSING_REQUIREMENTS",
"details": "All: ['ccx_ocp_core.specs.must_gather_ocs.OperatorsOcsMGOCS'] Any: ",
"type": "skip"
}
],
"info": [],
"pass": []
},
"Metadata": {
"gathering_time": "2020-04-02T08:58:25.168949Z"
}
}
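To show how a consumer might read messages like the ones above, the sketch below collects the hit rule IDs, the skipped rule names, and the optional gathering time into one small summary. The function name and the shape of the summary are hypothetical, and the sample message is a shortened version of the last example.

```python
from typing import Any, Dict


def summarize(message: Dict[str, Any]) -> Dict[str, Any]:
    """Build a small overview of one message from the pipeline."""
    report = message["Report"]
    return {
        "cluster": message["ClusterName"],
        "hits": [hit["rule_id"] for hit in report["reports"]],
        # skips is optional in the Report node, so fall back to an empty list.
        "skipped": [skip["rule_fqdn"] for skip in report.get("skips", [])],
        # Metadata is optional at the top level as well.
        "gathering_time": message.get("Metadata", {}).get("gathering_time"),
    }


# Shortened version of the last example message.
message = {
    "OrgID": 12345678,
    "AccountNumber": 2233445,
    "ClusterName": "aaaaaaaa-bbbb-cccc-dddd-0123456789ab",
    "LastChecked": "2020-04-02T09:00:05.268294Z",
    "Report": {
        "system": {"metadata": {}, "hostname": None},
        "reports": [
            {"rule_id": "nodes_requirements_check|NODES_MINIMUM_REQUIREMENTS_NOT_MET"}
        ],
        "fingerprints": [],
        "skips": [{"rule_fqdn": "ccx_rules_ocp.ocs.check_ocs_version.report"}],
        "info": [],
        "pass": [],
    },
    "Metadata": {"gathering_time": "2020-04-02T08:58:25.168949Z"},
}
summary = summarize(message)
```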