Establish data ingest governance roles & practices

Designate a few explicit roles and practices to ensure context and accountability in your ingest planning. At a minimum, select a governance team and schedule check-ins throughout the year to plan and adapt as needed.

Stakeholders and participants

Regardless of how your organization's teams are structured, you need to identify the individuals who will participate in the data governance process. Selection of the team can be ad hoc, but it should represent a broad enough cross section of teams that you have the right mix of knowledge and authority when priorities must be set and decisions made. The team should have one individual who can be considered the overall observability manager. This may be the person who manages the New Relic account or an overarching team leader responsible for the systems and infrastructure monitored by New Relic.

Observability manager

This is the go-to person for resolving conflicts and communicating with senior management. This individual is instrumental when the organization contains gray areas of ownership that lead to questions like "Who owns this Kubernetes cluster?" and "Why is it sending so much data this week?" The observability manager must be able to work with technical individual contributors as well as senior management, and to foster consensus and cooperation when tough decisions are needed.

The governance team

The observability manager functions as the lead for this team. The members of the governance team bring practical technical knowledge of the systems and services that are monitored in New Relic. They may be peers or direct reports of the observability manager. They share a common goal of high-quality observability for the entire organization, one that transcends any single team or business unit. If you have a pre-existing structure such as an Observability Center of Excellence (OCoE), your governance team can be drawn primarily from the OCoE core team.

The primary responsibilities of an OCoE team generally are:

  • Maintain the relationship with New Relic.
  • Govern accounts and users.
  • Onboard new teams and individuals.
  • Maintain an observability knowledge base.
  • Promote collaboration and sharing among teams.

Incorporating data governance adds the following responsibilities:

  • Work with the observability manager to stay within monthly ingest targets.
  • Monitor data ingest baselines and respond to anomalies.
  • Draft and approve plans for data optimization/reduction as needed.
  • Participate in scheduled check-ins where baseline data is analyzed and compared to ingest targets.
  • Make modifications to ingest targets as needed.

Timelines and check-ins

Schedule data ingest governance meetings throughout the year to keep everyone up to date on data ingest volumes. Doing so makes data ingest governance predictable and easy to manage.

Yearly ingest target planning

Meet to set and maintain an organization-wide telemetry ingest target. You can break this target out into as many facets as is useful for your organization. For example, you may adopt the following ingest targets:

  • Organization Wide (Monthly): 1000TB
  • Team A (Monthly): 500TB
  • Team B (Monthly): 300TB
  • Team C (Monthly): 100TB

This rough set of targets leaves 100TB as a buffer for uncertainty. You may also choose telemetry-specific limits for telemetry types that are highly variable. For example, you may set organization-level or team-level limits on log or metrics ingest.
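The buffer arithmetic above can be checked with a short script. This is a minimal sketch in Python that uses only the example figures from this section; the names `ORG_TARGET_TB`, `TEAM_TARGETS_TB`, and `unallocated_buffer` are hypothetical and not part of any New Relic tooling.

```python
# Example monthly ingest targets in TB, copied from the list above.
# All names here are illustrative assumptions, not New Relic APIs.
ORG_TARGET_TB = 1000
TEAM_TARGETS_TB = {"Team A": 500, "Team B": 300, "Team C": 100}


def unallocated_buffer(org_target, team_targets):
    """Return the part of the organization target not assigned to any team."""
    allocated = sum(team_targets.values())
    if allocated > org_target:
        raise ValueError(
            f"Team targets ({allocated} TB) exceed the "
            f"organization target ({org_target} TB)"
        )
    return org_target - allocated


buffer_tb = unallocated_buffer(ORG_TARGET_TB, TEAM_TARGETS_TB)
share = buffer_tb / ORG_TARGET_TB
print(f"Buffer: {buffer_tb} TB ({share:.0%} of the organization target)")
```

Raising an error when team targets exceed the organization target is one possible design choice; it surfaces an inconsistent plan during yearly planning rather than at a later check-in.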

Monthly ingest check-ins

During these sessions, you'll track ingest against your plan and produce the action items needed to stay on target. Using the example targets above, you'll want to know whether teams A, B, and C are meeting their agreed ingest targets. If something is out of alignment, the governance team will suggest an optimization plan.
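A check-in of this kind can be sketched as a comparison of actual ingest against each team's target. The following Python example is illustrative only: the `actuals` figures are invented for the example, and `IngestStatus` and `check_in` are hypothetical names. In practice the actual volumes would come from your own usage data.

```python
from dataclasses import dataclass


@dataclass
class IngestStatus:
    team: str
    target_tb: float
    actual_tb: float

    @property
    def over_by_tb(self) -> float:
        """TB above target, or 0.0 when the team is within its target."""
        return max(0.0, self.actual_tb - self.target_tb)

    @property
    def utilization(self) -> float:
        """Actual ingest as a fraction of the target."""
        return self.actual_tb / self.target_tb


def check_in(targets, actuals):
    """Build one status per team, highest utilization first."""
    statuses = [
        IngestStatus(team, target, actuals.get(team, 0.0))
        for team, target in targets.items()
    ]
    return sorted(statuses, key=lambda s: s.utilization, reverse=True)


# Targets from the yearly planning example; actuals are made-up sample data.
targets = {"Team A": 500, "Team B": 300, "Team C": 100}
actuals = {"Team A": 450, "Team B": 360, "Team C": 95}

report = check_in(targets, actuals)
for status in report:
    note = "needs optimization plan" if status.over_by_tb > 0 else "on target"
    print(f"{status.team}: {status.utilization:.0%} of target ({note})")
```

Sorting by utilization puts the teams closest to or over their targets at the top of the report, which is where a monthly discussion would usually start.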

Ad hoc anomaly resolution

Generally, these sessions are reserved for events that, if left unattended, would substantially impact the organization's budget. These anomalies have many possible causes. Some scenarios to watch for:

  • A new software deployment increases log volume by 3x.
  • A team enables a handful of new cloud integrations that unexpectedly increase metrics ingest by 200%.
  • An acquisition of a new company leads to an increase in overall telemetry volume.
  • Peak business season activity combined with some pre-peak refactors results in a much higher than expected custom events volume.

The optimizing section of this guide provides a structured approach for assessing these anomalies and taking action.
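As a rough illustration of how a sudden jump such as the 3x log volume scenario might be flagged, the Python sketch below compares the most recent day's ingest with the median of the preceding days. The function name `ingest_ratio`, the 7-day baseline window, the 2.0 threshold, and the sample volumes are all assumptions chosen for the example, not values recommended by this guide.

```python
from statistics import median


def ingest_ratio(daily_gb, baseline_days=7):
    """Return latest-day ingest divided by the median of the prior window."""
    if len(daily_gb) < baseline_days + 1:
        raise ValueError("not enough history for the baseline window")
    # Baseline excludes the latest day so a spike cannot mask itself.
    baseline = median(daily_gb[-(baseline_days + 1):-1])
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return daily_gb[-1] / baseline


def is_anomalous(daily_gb, threshold=2.0, baseline_days=7):
    """Flag the latest day when it reaches threshold x the baseline median."""
    return ingest_ratio(daily_gb, baseline_days) >= threshold


# Made-up daily log ingest in GB; the last value follows a new deployment.
steady = [100, 102, 98, 101, 99, 103, 100, 105]
spiking = [100, 102, 98, 101, 99, 103, 100, 300]

print(is_anomalous(steady))   # latest day is close to the baseline
print(is_anomalous(spiking))  # latest day is three times the baseline
```

A median baseline is used here because a single earlier spike shifts it less than it would shift a mean. Seasonal patterns, such as peak business periods, would need a longer or season-aware baseline.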

Copyright © 2022 New Relic Inc.