Adding Metrics section to capabilities in understanding domain #1068

Open

KevDLR wants to merge 5 commits into features/mslearn from KevDLR/features/mslearn/docs

Contributor

KevDLR commented Oct 18, 2024 •

edited by flanakin

🛠️ Description

Adding a Metrics/KPI section to the capabilities to provide guidance on the Metric lens of the FinOps assessment.

Fixes
N/A

📋 Checklist

🔬 How did you test this change?

🤏 Lint tests

🤞 PS -WhatIf / az validate

👍 Manually deployed + verified

💪 Unit tests

🙌 Integration tests

🙋‍♀️ Do any of the following that apply?

🚨 This is a breaking change.

🤏 The change is less than 20 lines of code.

📑 Did you update `docs/changelog.md`?

✅ Updated changelog (required for dev PRs)

➡️ Will add log in a future PR (feature branch PRs only)

❎ Log not needed (small/internal change)

📖 Did you update documentation?

✅ Public docs in docs (required for dev)

✅ Internal dev docs in src (required for dev)

➡️ Will add docs in a future PR (feature branch PRs only)

❎ Docs not needed (small/internal change)


          Data Ingestion Metrics

KevDLR requested a review from flanakin

October 18, 2024 17:09

microsoft-github-policy-service bot requested a review from arthurclares

October 18, 2024 17:09

microsoft-github-policy-service bot added the Needs: Review 👀 label

microsoft-github-policy-service bot assigned arthurclares and flanakin

microsoft-github-policy-service bot added the Tool: FinOps guide label

flanakin requested changes

Collaborator

flanakin left a comment

This looks good. I haven't looked at it from a completeness perspective, but I think it's a great list! My comments are mostly around landing the right way to bring metrics into the guide altogether.

docs-mslearn/framework/understand/ingestion.md Outdated

		@@ -108,6 +108,21 @@ At this point, you have a data pipeline and are ingesting data into a central da

		<br>

		## Data Ingestion Metrics

Collaborator

flanakin Oct 18, 2024

Let's use the same headers across all files so we can link to them generically. Also note we should use sentence casing rather than title casing to align with the Microsoft Style Guide.

Suggested change

      
-              ## Data Ingestion Metrics
+              ## KPIs and metrics

docs-mslearn/framework/understand/ingestion.md

               <br>
+              ## Data Ingestion Metrics
+              | **Category** | **Definition** | **KPI** |

Collaborator

flanakin Oct 18, 2024

Update all of the categories to be sentence cased to align to the Microsoft Style Guide.

docs-mslearn/framework/understand/ingestion.md

               <br>
+              ## Data Ingestion Metrics
+              | **Category** | **Definition** | **KPI** |

Collaborator

flanakin Oct 18, 2024

Each KPI should include a formula. We may not be able to format this as a table.
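
As one illustration of what a per-KPI formula could look like, the growth rate KPI (described in the table as the percentage increase of total data volume per unit of time) might be written as below. The symbol $V_t$ is an assumption introduced here for the total data volume in the repository at the end of period $t$; it does not come from the PR.

$$
\text{Growth rate}_t = \frac{V_t - V_{t-1}}{V_{t-1}} \times 100\%
$$

For example, if the repository holds 200 GB at the end of one month and 230 GB at the end of the next, the growth rate for that month is $(230 - 200) / 200 \times 100\% = 15\%$.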

docs-mslearn/framework/understand/ingestion.md Outdated

+              | **Category** | **Definition** | **KPI** |
+              |----------|-----------|-----|
+              | Data Completeness | Measures the extent to which all required data fields are present in the dataset and tracks the overall data completeness trend over a specified period.| Percentage of data fields that are complete and the overall data completeness over time. |

Collaborator

flanakin Oct 18, 2024

How feasible is it to measure this? I'm not pushing back. It sounds like the right thing to do, but do they have a way to actually measure it? How would we calculate this for them? Should we outline any potential challenges they may have in collecting this to give them a heads up? I'd hate for someone to take this list and say, "let's go track all these" and then realize there's no way to do it.
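
One possible way to make the feasibility question concrete is the minimal sketch below, which computes completeness as the share of required field values that are populated. The field names, the sample rows, and the rule that `None` or an empty string counts as missing are all assumptions for illustration, not definitions from the PR.

```python
from typing import Any, Iterable, Mapping, Sequence


def completeness_pct(
    rows: Iterable[Mapping[str, Any]], required_fields: Sequence[str]
) -> float:
    """Return the percentage of required field values that are populated."""
    total = 0
    filled = 0
    for row in rows:
        for field in required_fields:
            total += 1
            value = row.get(field)
            # Assumption: None and empty strings both count as missing.
            if value is not None and value != "":
                filled += 1
    return 100.0 * filled / total if total else 0.0


# Hypothetical required columns and sample rows.
REQUIRED = ["BilledCost", "ServiceName", "SubAccountId"]
rows = [
    {"BilledCost": 1.2, "ServiceName": "Storage", "SubAccountId": "a"},
    {"BilledCost": 0.5, "ServiceName": "", "SubAccountId": "b"},
    {"BilledCost": None, "ServiceName": "Compute", "SubAccountId": "c"},
]

print(round(completeness_pct(rows, REQUIRED), 2))
```

The main collection challenge this surfaces is that someone has to decide which fields are required and what counts as missing for each dataset before the number means anything.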

docs-mslearn/framework/understand/ingestion.md

               <br>
+              ## Data Ingestion Metrics
+              | **Category** | **Definition** | **KPI** |

Collaborator

flanakin Oct 18, 2024

Can you add each one of these into the backlog for adding to Power BI?

docs-mslearn/framework/understand/ingestion.md

+              |Data Ingestion Frequency | Measures how often data is ingested into the system. | Number of data ingestion events per unit of time (daily, weekly, monthly, quarterly, annually). |
+              | Volume of Data Ingested | Measures the total volume of data ingested into the repository. | Total volume of data ingested into the repository.  |
+              | Growth Rate | Measure the rate at which the volume of data ingested is increasing over time. | Percentage increase of total data volume in repository per unit of time. |
+              | Ingestion Latency | Measures the average time taken for data to be ingested into the repository and tracks the trend of this latency over a specified period. | Mean time of data ingestion latency and the latency trend over a specified period. |

Collaborator

flanakin Oct 18, 2024

I like this one. A few thoughts:

Do we need to call out that latency may differ by dataset?
Do you intend to use "mean" time? Not average or percentile? All have merits, so just confirming.
This can likely be split into multiple KPIs.
Is latency trend a KPI or a visualization of a KPI over time? Not sure if visualizations need to be called out here unless we need to speak to the value of the visual. I'm open to either approach. Just thinking out loud to keep this simple. If we do keep it, it's probably a separate KPI that might be better if we can quantify a single number for it. Not sure 🤔
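
To illustrate the mean versus percentile question and the per-dataset point, here is a minimal sketch that summarizes latency samples separately for each dataset. The dataset names and latency values are made up, and the nearest-rank percentile is just one of several common percentile definitions.

```python
import math
from statistics import mean, median


def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples):
    """Summarize one dataset's latency samples three ways."""
    return {
        "mean": mean(samples),
        "median": median(samples),
        "p95": percentile(samples, 95),
    }


# Hypothetical ingestion latency samples, in hours, keyed by dataset.
latency_hours = {
    "costs": [4, 5, 6, 5, 30],
    "prices": [1, 1, 2, 1, 1],
}

summary = {name: summarize(samples) for name, samples in latency_hours.items()}
for name, stats in summary.items():
    print(name, stats)
```

In the sample data, a single slow run pulls the mean for `costs` well above its median, which is the kind of difference that would argue for tracking these as separate KPIs.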

docs-mslearn/framework/understand/ingestion.md Outdated

+              | Volume of Data Ingested | Measures the total volume of data ingested into the repository. | Total volume of data ingested into the repository.  |
+              | Growth Rate | Measure the rate at which the volume of data ingested is increasing over time. | Percentage increase of total data volume in repository per unit of time. |
+              | Ingestion Latency | Measures the average time taken for data to be ingested into the repository and tracks the trend of this latency over a specified period. | Mean time of data ingestion latency and the latency trend over a specified period. |
+              | Historical Data Availability | Measures the lookback period of data that is ingested and available for analysis. | Span of historical data ingested. |

Collaborator

flanakin Oct 18, 2024

This name needs some work, but I do like it. I've thought about this one as well. We need to know what data is missing so we can backfill it. Should this be bound to months with complete data over the retention/reporting period?
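
A minimal sketch of the "months with complete data over the retention/reporting period" idea follows. It assumes some upstream check already decides which months are complete; the function names and the sample period are hypothetical.

```python
from datetime import date


def months_in_period(start, end):
    """List (year, month) tuples from start to end, inclusive."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


def coverage(complete_months, start, end):
    """Return (percent of months complete, list of months to backfill)."""
    expected = months_in_period(start, end)
    if not expected:
        return 0.0, []
    missing = [m for m in expected if m not in complete_months]
    pct = 100.0 * (len(expected) - len(missing)) / len(expected)
    return pct, missing


# Hypothetical: months already verified as complete by some upstream check.
complete = {(2024, 5), (2024, 6), (2024, 8), (2024, 9), (2024, 10)}
pct, missing = coverage(complete, date(2024, 5, 1), date(2024, 10, 31))
print(round(pct, 2), missing)
```

Returning the missing months alongside the percentage gives both a single number for the KPI and the list of months that need backfilling.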

docs-mslearn/framework/understand/ingestion.md Outdated

+              | Volume of Data Ingested | Measures the total volume of data ingested into the repository. | Total volume of data ingested into the repository.  |
+              | Growth Rate | Measure the rate at which the volume of data ingested is increasing over time. | Percentage increase of total data volume in repository per unit of time. |
+              | Ingestion Latency | Measures the average time taken for data to be ingested into the repository and tracks the trend of this latency over a specified period. | Mean time of data ingestion latency and the latency trend over a specified period. |
+              | Historical Data Availability | Measures the lookback period of data that is ingested and available for analysis. | Span of historical data ingested. |

Collaborator

flanakin Oct 18, 2024

This brings up a question about how much people are using historical data. We should probably talk about the cost of each month of data compared to the usage of that data. If people aren't using it, then that's wasted money. That will also help them quantify the value of storing the historical data.
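
One way to compare the cost of each month of data with its usage is sketched below as storage cost per query. The cost figures, the query counts, and the choice of "queries" as the usage measure are assumptions for illustration only.

```python
def cost_per_query(monthly_storage_cost, queries_by_month):
    """Map each data month to storage cost per query, or None if it was never queried."""
    result = {}
    for month, cost in monthly_storage_cost.items():
        queries = queries_by_month.get(month, 0)
        result[month] = cost / queries if queries else None
    return result


# Hypothetical storage cost per month of retained data and query counts against it.
storage_cost = {"2023-01": 12.0, "2024-09": 15.0}
query_counts = {"2024-09": 300}

usage = cost_per_query(storage_cost, query_counts)
unused = [month for month, value in usage.items() if value is None]
print(usage, unused)
```

Months that come back as `None` are retained but unqueried, which makes them candidates for the "wasted money" discussion.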

docs-mslearn/framework/understand/ingestion.md

               <br>
+              ## Data Ingestion Metrics
+              | **Category** | **Definition** | **KPI** |

Collaborator

flanakin Oct 18, 2024

Can you think about the cost and carbon impact of each one of these? It may not apply everywhere. Anything that comes back to something that is metered, like data size or compute time.

docs-mslearn/framework/understand/ingestion.md Outdated

+              | Growth Rate | Measure the rate at which the volume of data ingested is increasing over time. | Percentage increase of total data volume in repository per unit of time. |
+              | Ingestion Latency | Measures the average time taken for data to be ingested into the repository and tracks the trend of this latency over a specified period. | Mean time of data ingestion latency and the latency trend over a specified period. |
+              | Historical Data Availability | Measures the lookback period of data that is ingested and available for analysis. | Span of historical data ingested. |
+              | Investigation Time to Resolution | Measures the time taken to investigate and resolve data quality or availability issues and tracks the trend of this resolution time over a specified period. | Mean time to investigate and resolve data quality or availability issues, and the trend over time. |

Collaborator

flanakin Oct 18, 2024

Similar comments about trends on this one. It's very interesting. This warrants its own backlog to think thru whether we have the right guidance to support it.

microsoft-github-policy-service bot added Needs: Attention 👋 and removed Needs: Review 👀 labels

microsoft-github-policy-service bot assigned KevDLR


          Allocation Metrics

bf91fba

microsoft-github-policy-service bot requested a review from flanakin

October 18, 2024 19:55

microsoft-github-policy-service bot added Needs: Review 👀 and removed Needs: Attention 👋 labels

KevDLR changed the title ~~Adding Metrics section to Data Ingestion capability~~ Adding Metrics section to capabilities

Kevin De La Rosa added 2 commits

October 18, 2024 13:18


          Updated Category column to match style guide

98c0bc6


          Reporting Metrics

7cfa67d

KevDLR changed the title ~~Adding Metrics section to capabilities~~ Adding Metrics section to capabilities in understanding domain


          Anomaly Metrics

2306b40

Contributor Author

KevDLR commented Oct 23, 2024

@KevDLR please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.
@microsoft-github-policy-service agree [company="{your company}"]
Options:

(default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
(when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"
Contributor License Agreement

@microsoft-github-policy-service agree company="Microsoft"

flanakin added this to the Guide - Build-out milestone

flanakin added Needs: Attention 👋 and removed Needs: Review 👀 labels


Labels

Needs: Attention 👋 Tool: FinOps guide