Defining what is real-time #8
Conversation
It is important to clearly define what this document covers in the 'life' of a glider operation
@@ -24,6 +22,9 @@ Other types of platforms with different natures of operation, such as wave glide
//Why should we do RTQC?
There is a wide range of applications that cannot afford waiting for the delayed-mode product due to time constraints, such as data assimilation for weather and sea state forecasts, thus requiring a real-time quality control with low latency. Given the diverse environments where gliders are operated and the differences among glider models, the operators themselves are the best suited to evaluate their own measurements. Despite that, different users might have different priorities as well as different tolerances for what they consider useful data. While for data assimilation one might not forgive bad measurements and would rather use less data with higher quality, a monitoring or alerting system cannot afford wrongly flagging and missing a single extreme event. Although there is no single optimal flagging for everyone, communication with the final users and understanding their expectations can help fine-tune the QC criteria and the desired informative flags. It should be expected that some users will apply their own QC procedure in addition to the QC from the operators, but all it takes is one obviously bad data point to damage the credibility of the whole dataset. Even before the final users, the glider operation itself benefits from a continuous real-time QC. A prompt detection of sensor degradation might trigger an early recovery to swap the sensors, or the full platform when possible. Catastrophic failures are sometimes preceded by anomalous behaviour, thus a high rate of errors should raise the alertness of the pilots. Some spurious measurements are inevitable for any ocean observing system. Despite the constraints imposed by telemetry and the pressure for low latency, real-time QC has clear benefits and should be part of every glider operation.
// What defines real-time?
Regarding 'real-time' or 'near real-time' for gliders: operators and scientists may have only a subset of all the observations on land during the glider deployment due to satellite communication. The observation availability depends on the surface time (shipping traffic, fishing activities) and the corresponding cost of satellite communication. In real-time, we should always expect limited observations. Despite that fact, the decimated observations should offer enough information regarding data quality and the study area's different physical and biogeochemical processes on horizontal and vertical scales. The particular issue of real-time QC is evaluating the most recent data point, which can be ingested into models or be valuable for stakeholders and policymakers. A real-time observation is typically acquired within seconds to hours and can be instantly used for model ingestion. This is often the case for operational modelling communities, which require data within 24 hours, require automated procedures, and are aware that the data has not been subjected to climate-grade QC.
In the lifetime of a glider mission, which phases does this document cover? We should have that clearly defined, both to guide the reader and to help in our decisions.
Proponents: @castelao
Moderator: @OceanGlidersCommunity/rtqc-maintainers
Type of PR
Related Issues
Dates when it got review approvals
Release checklist
ask reviewers to update their review.
to merge.
according to the final decision.
For maintainers
It is best to have one single moderator to guide and help this PR move
forward. It is OK to update the moderator or pass the role to another one.
Comments
Although we had a general agreement here, there are a few points that are worth confirming with the team. For that reason, I split it into its own PR. I'm copying here all the discussions and comments from async meetings back around May 2021, so we have everything related in a single place.
What defines “Real-time” QC?
What defines a QC procedure as "real-time" or "near real-time" for underwater gliders, in contrast to delayed-mode QC? Due to satellite bandwidth, which is reflected in cost and surface time, we will never have all the engineering or scientific information on land in real-time; we will need to wait for the recovery to download all the available parameters at full resolution. Therefore, we should always expect limited and incomplete information, and we will do the best we can. The real-time nature of this data pipeline also imposes limitations, in comparison with the delayed mode, on the supporting information for QC decisions: lacking measurements from other platforms for comparison (including satellite and numerical simulations/reanalysis); post-deployment calibrations; time history (important to detect drift).
Let’s define real-time & near real-time QC as the QC done until the end of a deployment/mission; after that stage it becomes a delayed-mode QC job. Even if the glider is lost, other supporting information will be available for the delayed mode.
Mark: I see five temporal QC stages, with the level of QC improving with each step:
Real time (usually seconds to hours) - data are acquired, QC’d, and disseminated without delay for immediate application (model ingestion, resource manager decisions, etc.). The unique challenge to RT QC is that the most current data point is being evaluated.
Near real time (hours to days) - QC of a recent portion of a time series. Timeliness is a factor but some latency can be tolerated and the QC improved by evaluation of data other than the most recent data point. I try to avoid the use of this term unless it’s very explicitly defined (it usually isn’t).
Delayed mode (days to months) - By design, dissemination is delayed so that better and more efficient QC can be applied.
Post-processed - (months to years) - conducted upon completion of a mission, and may include post calibration corrections.
Reanalysis - (years to decades) - multiple related data sets and models are used to create the very best QC.
These generic descriptions need refinement for glider applications. If the most recent glider profile is to be disseminated then it’s RT QC. The fact that it may be acquired with several other prior profiles during telemetry doesn’t make it NRT.
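Mark's five stages can be sketched as a simple latency-to-stage mapping. This is a minimal illustration only: the threshold values below are assumptions based on the rough ranges listed above (seconds to hours, hours to days, and so on), not agreed boundaries.

```python
from datetime import timedelta

# Hypothetical latency thresholds; the cut-offs are illustrative
# assumptions, not values agreed by the group.
STAGES = [
    (timedelta(hours=24), "real-time"),
    (timedelta(days=7), "near real-time"),
    (timedelta(days=180), "delayed mode"),
    (timedelta(days=730), "post-processed"),
]


def qc_stage(latency: timedelta) -> str:
    """Map the delay between acquisition and dissemination to a QC stage."""
    for threshold, name in STAGES:
        if latency <= threshold:
            return name
    return "reanalysis"
```

For example, a profile QC'd one hour after acquisition would classify as "real-time", while one revisited two years later would fall under "reanalysis".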
Justin: The key thing here is to define the users and applications for each mode of QC.
Nikolaos: We can define this separately: the time at which you apply and perform the built-in algorithms in your glider processing toolbox or on a server.
Justin:
(in respect to near real time pointed by Mark)
The key question is who the primary user of the RTQC is. This is typically the operational modelling communities, who need data within 24 hours, thus procedures need to be automated, and who are aware the data have not been through climate-grade QC. They need data that have had a crude QC so they do not adversely affect the data assimilation. That said, the results of RTQC do help to inform subsequent levels of QC, e.g. data flagged 3 (probably bad) would be altered to 4 (bad data) once inspected.
(in respect to delayed mode)
These are the data needed for activities such as IPCC assessments so require the highest feasible QC applying.
(in respect to post-processed)
This is potentially data that are adjusted to reflect sensor wide issues identified on batches of deployments.
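Justin's point about promoting a real-time flag 3 to 4 after inspection can be illustrated with the widely used IOC/Argo-style flag values (1 good, 2 probably good, 3 probably bad, 4 bad). The helper below is a hypothetical sketch of that escalation, not an agreed procedure; the choice to downgrade an unconfirmed 3 to 2 is an assumption.

```python
# Flag values follow the common IOC/Argo convention; the review rule
# itself is an illustrative assumption.
FLAG_GOOD, FLAG_PROBABLY_GOOD, FLAG_PROBABLY_BAD, FLAG_BAD = 1, 2, 3, 4


def review_flag(rt_flag: int, confirmed_bad: bool) -> int:
    """Resolve a real-time 'probably bad' flag after human inspection."""
    if rt_flag == FLAG_PROBABLY_BAD:
        return FLAG_BAD if confirmed_bad else FLAG_PROBABLY_GOOD
    return rt_flag
```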
Gui: Mark, this is a reasonable timeline. I like it. The reason for my question on what defines the RTQC is to determine what is our “jurisdiction”. What should we be concerned with, and what is beyond our scope.
The fast data pipeline based on automatic procedures is clear. The question is, when do we stop and start the delayed mode procedure? What is the message that we want to transmit to the users of our data? One possibility is that delayed mode would be associated with scientific publication grade data, i.e., the delayed mode is a quality stamp that we assure can be used in publications, while the real-time we did our best so far, and if one cannot wait any longer, use it at your own risk. For instance, if someone finds a trend with the RTQC data, that should not be published until verified from the delayed mode. If someone works with RTQCed data, there is an implicit message that it is expected to later go back and update it with the delayed mode. For Spray, we don’t stamp as delayed mode until we have time to do all pos-deployment checks, complete manual QC, and calibrations when necessary. It could be days if urgent or a couple of months after the recovery, especially on overseas missions when one needs to account for shipping the glider back. The effort on the delayed mode will vary a lot depending on the resources available for the operator, so it might not be possible to specify a clear timeline when it happens.
My suggestion is that delayed mode would be top-grade quality, fully trustable, ready for publication. Hence, it is not possible before the end of the mission and can actually take quite some time after that. Anything before that would be real-time, maybe including near real-time.
Nikolaos, (in respect to "maybe include near real-time."):
I think the term near real-time fits better, as there is a very fine balance between what we call real-time and near real-time depending on the way that you process or communicate with the glider.
Gui: Hi everyone, I think this is an important topic to agree on since it can define/influence several other topics. Please express your opinion here. To clarify, this is the real-time QC discussion group, and I want to define what the discussions here should cover and what they should not.
My suggestion is:
Gui: So far, the only clear consensus for me is that real-time QC is automatic and done as promptly as possible, with minimum delay, to transmit the data just received. I’ll close this topic as that so we don’t hold up the process, but I’ll open another topic as a placeholder for near real-time in case someone has an opinion about that.
More comments
Thierry: In Ifremer, we use the term "real-time QC" for "automated QC": a series of QC tests that are applied on a dataset by an algorithm that does not need human interaction.
The "automated QC" is the first step for quality control on a dataset (real-time data, recovered data, historical data).
We use the term "delayed mode QC" for quality control procedures with human-expert decisions that include QC flag setting, adjustments on parameters, error on measurement estimates.
To discriminate between "real-time QC" and "delayed-mode QC" on each parameter of the dataset, we use a parameter_data_mode variable (the temperature variable may be in delayed mode while the oxygen variable is still in real-time).
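Thierry's per-parameter data mode can be sketched with Argo-style codes ('R' real-time, 'A' adjusted, 'D' delayed mode). The parameter names and the whole-dataset summary rule below (with 'M' for a mixed dataset) are illustrative assumptions, not an OceanGliders standard.

```python
# Per-parameter data modes, e.g. temperature and salinity already in
# delayed mode while oxygen is still real-time. Names are illustrative.
parameter_data_mode = {"TEMP": "D", "PSAL": "D", "DOXY": "R"}


def dataset_mode(modes: dict) -> str:
    """Summarize a dataset: 'D' only when every parameter is delayed mode,
    'M' (mixed, an assumed code) when modes differ, 'R' when all real-time."""
    values = set(modes.values())
    if values == {"D"}:
        return "D"
    return "M" if values & {"D", "A"} else "R"
```

With the example above, the dataset as a whole would summarize as mixed until the oxygen variable also reaches delayed mode.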
Gui: That is a good point, Thierry. I think one big difference compared to Argo is that we (expect to) recover our gliders, and that might be the milestone for transitioning from real-time/near real-time to delayed mode. It is a big difference to be able to download everything and have access to the sensors.
I don't have a strong opinion here and think it could go either way, but I think that we, as community, should think about that and adopt one opinion. I think both directions could work equally well.
Christoph: Would that be a good entry point to talk about data product levels like the ones for ICOS https://www.icos-cp.eu/data-services/data-collection/data-levels-quality ?
Gui: Christoph, I'm sorry that I missed your comment before.
That's a good point. What would be the benefit of explicitly defining these levels? If I got the ICOS point, this group would be working with level 1 only, is that correct? I can see the benefits for the data management group in adopting this scale to clearly define and refer to the stages of the data life, but for this group it is not clear to me what the benefit would be for us. At this point I'm more inclined to simply adopt QC procedures for real-time and near real-time, which is more generic.
If you have a strong opinion that we should adopt it here, could you please develop more on what the benefits would be? Despite that, I suggest bringing this to the data management group. @[email protected] & @[email protected] , what do you think?
Christoph: Guilherme, I agree with your suggestion to have the data management suggest a scheme. Also, in that graph the data processing levels could be included.
===
Nikolaos, in reference to: "all the engineering or scientific information on land"
I think we should also include the engineering parameters, as many groups have split the engineering team from the scientific one, so the engineers/pilots care only about the behaviour of the vehicle.
Gui: Makes sense to me. I agree with you.