Merge pull request #57 from NREL/remove_single_time_series_aggregate
removed aggregate single time series feature
KapilDuwadi committed Nov 20, 2024
0 parents commit f90ceab
Showing 78 changed files with 15,475 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .buildinfo
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file records the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 2f4806f61ff250f13d540156aeb767b0
tags: 645f666f9bcd5a90fca523b33c5a78b7
Empty file added .nojekyll
Empty file.
45 changes: 45 additions & 0 deletions _sources/explanation/components.md.txt
@@ -0,0 +1,45 @@
```{eval-rst}
.. _components-page:
```
# Components
A component is any element that is attached to a system.

Every component must define a `name` string because the base class requires it. A name may not be
meaningful for every class; the `Location` class in this package is one example. In such cases,
developers can redefine the `name` field and set its default value to `""`.

Refer to the [Components API](#components-api) for more information.

## Inheritance
Recommended rule: A `Component` that has subclasses should never be directly instantiated.

Consider a scenario where a developer defines a `Load` class and then later decides a new load is
needed because of one custom field.

The temptation may be to create `CustomLoad(Load)`. This is problematic given the design of
the `infrasys` API: there is no way to retrieve only `Load` instances. Consider this example:

```python
for load in system.get_components(Load):
    print(load.name)
```

This will retrieve both `Load` and `CustomLoad` instances.

Instead, our recommendation is to create a base class with the common fields.

```python
class LoadBase(Component):
    """Defines common fields for all Loads."""

    common_field1: float
    common_field2: float


class Load(LoadBase):
    """A load component"""


class CustomLoad(LoadBase):
    """A custom load component"""

    custom_field: float
```
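
To see why the shared base class helps, here is a minimal standalone sketch. It does not use `infrasys`; the `Component` class and the `get_components` function below are hypothetical stand-ins, and the sketch assumes that retrieval by type behaves like an `isinstance` filter, which is consistent with the behavior described above.

```python
class Component:
    """Hypothetical stand-in for a component base class (not infrasys.Component)."""

    def __init__(self, name: str) -> None:
        self.name = name


class LoadBase(Component):
    """Common parent for all loads; never instantiated directly."""


class Load(LoadBase):
    """A load component."""


class CustomLoad(LoadBase):
    """A custom load component."""


def get_components(components, component_type):
    """Return every component that is an instance of component_type.

    Assumption: an isinstance-style filter, so subclasses always match.
    """
    return [c for c in components if isinstance(c, component_type)]


components = [Load("load1"), CustomLoad("custom1")]

# Load and CustomLoad are siblings, so each can be retrieved on its own.
only_loads = [c.name for c in get_components(components, Load)]

# The shared base class still retrieves every kind of load.
all_loads = [c.name for c in get_components(components, LoadBase)]
```

With sibling classes, a query for `Load` no longer picks up `CustomLoad`, and a query for `LoadBase` covers the case where both are wanted.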
16 changes: 16 additions & 0 deletions _sources/explanation/index.md.txt
@@ -0,0 +1,16 @@
```{eval-rst}
.. _explanation-page:
```
# Explanation

```{eval-rst}
.. toctree::
:maxdepth: 2
:caption: Contents:

system
components
time_series
location
serialization
```
4 changes: 4 additions & 0 deletions _sources/explanation/location.md.txt
@@ -0,0 +1,4 @@
# Location
Components can compose this class in order to specify their geographic location.

Refer to the [Location API](#location-api) for more information.
122 changes: 122 additions & 0 deletions _sources/explanation/serialization.md.txt
@@ -0,0 +1,122 @@
# Serialization
This page describes how `infrasys` serializes a system and its components to JSON when a user calls
`System.to_json()` and how it de-serializes them when a user calls `System.from_json()`.

## Components
`infrasys` converts its nested dictionaries of components-by-type into a flat array. Each component
records metadata about its actual Python type into a field called `__metadata__`. Here is an example
of a serialized `Location` object. Note that it includes the module and type. `infrasys` uses this
information during de-serialization to dynamically import the type and construct it. This allows
serialization to work with types defined outside of `infrasys` as long as the user has imported
those types.

```json
{
  "uuid": "1e5f90ae-a386-4c8a-89ae-0ed123da3e26",
  "name": null,
  "x": 0.0,
  "y": 0.0,
  "crs": null,
  "__metadata__": {
    "fields": {
      "module": "infrasys.location",
      "type": "Location",
      "serialized_type": "base"
    }
  }
}
```
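
The dynamic import step can be sketched with the standard library's `importlib`. This is an illustration of the general technique, not the `infrasys` implementation; the function name `construct_from_metadata` is hypothetical, and a standard-library type (`datetime.timedelta`) stands in for a component class so that the example runs without `infrasys` installed.

```python
import importlib


def construct_from_metadata(record: dict):
    """Import the type named in the record's metadata and construct it.

    Hypothetical helper: the remaining keys are passed as keyword arguments.
    """
    fields = dict(record)  # shallow copy so the caller's dict is left untouched
    metadata = fields.pop("__metadata__")["fields"]
    module = importlib.import_module(metadata["module"])
    cls = getattr(module, metadata["type"])
    return cls(**fields)


# Same layout as the serialized Location above, with a stdlib type as a stand-in.
record = {
    "hours": 1,
    "minutes": 30,
    "__metadata__": {
        "fields": {
            "module": "datetime",
            "type": "timedelta",
            "serialized_type": "base",
        }
    },
}

value = construct_from_metadata(record)
```

Because the module is looked up by name at run time, this approach only works if the module is importable in the de-serializing process, which matches the requirement stated above for types defined outside of `infrasys`.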

### Composed components
There are many cases where one component will contain an instance of another component. For example,
a `Bus` may contain a `Location` or a `Generator` may contain a `Bus`. When serializing each
component, `infrasys` checks the type of each of that component's fields. If a value is another
component (which means that it must also be attached to the system), `infrasys` replaces that instance
with its UUID. It does this to avoid duplicating data in the JSON file.

Here is an example of a serialized `Bus`. Note the value for the `coordinates` field. It contains the
type and UUID of the actual `coordinates`. During de-serialization, `infrasys` will detect this
condition and only attempt to de-serialize the bus once all `Location` instances have been
de-serialized.

```json
{
  "uuid": "e503984a-3285-43b6-84c2-805eb3889210",
  "name": "bus1",
  "voltage": 1.1,
  "coordinates": {
    "__metadata__": {
      "fields": {
        "module": "infrasys.location",
        "type": "Location",
        "serialized_type": "composed_component",
        "uuid": "1e5f90ae-a386-4c8a-89ae-0ed123da3e26"
      }
    }
  },
  "__metadata__": {
    "fields": {
      "module": "tests.models.simple_system",
      "type": "SimpleBus",
      "serialized_type": "base"
    }
  }
}
```
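
One way to picture the reference resolution is to index every record by UUID first and then replace each `composed_component` placeholder with the record it points to. The sketch below works on plain dictionaries and is only an illustration under that assumption; `resolve_composed` is a hypothetical name, and `infrasys` itself constructs typed objects in dependency order rather than linking dictionaries.

```python
def resolve_composed(records: list) -> dict:
    """Replace composed-component placeholders with the referenced record."""
    # First pass: index every record by its UUID.
    by_uuid = {r["uuid"]: dict(r) for r in records}

    # Second pass: swap each placeholder for the record it refers to.
    for record in by_uuid.values():
        for key, value in record.items():
            if key == "__metadata__" or not isinstance(value, dict):
                continue
            fields = value.get("__metadata__", {}).get("fields", {})
            if fields.get("serialized_type") == "composed_component":
                record[key] = by_uuid[fields["uuid"]]
    return by_uuid


location = {
    "uuid": "1e5f90ae-a386-4c8a-89ae-0ed123da3e26",
    "x": 0.0,
    "y": 0.0,
}
bus = {
    "uuid": "e503984a-3285-43b6-84c2-805eb3889210",
    "name": "bus1",
    "coordinates": {
        "__metadata__": {
            "fields": {
                "module": "infrasys.location",
                "type": "Location",
                "serialized_type": "composed_component",
                "uuid": "1e5f90ae-a386-4c8a-89ae-0ed123da3e26",
            }
        }
    },
}

# The bus is listed before the location to show that order does not matter here.
resolved = resolve_composed([bus, location])
resolved_bus = resolved["e503984a-3285-43b6-84c2-805eb3889210"]
```

Indexing before linking means the order of records in the file does not matter, and every component that refers to the same location shares a single instance.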

#### Denormalized component data
There are cases where users may prefer to have the full, denormalized JSON data for a component.
All components are subclasses of `pydantic.BaseModel` and so implement the method `model_dump_json`.

Here is an example of a bus serialized that way (`bus.model_dump_json(indent=2)`):

```json
{
  "uuid": "e503984a-3285-43b6-84c2-805eb3889210",
  "name": "bus1",
  "voltage": 1.1,
  "coordinates": {
    "uuid": "1e5f90ae-a386-4c8a-89ae-0ed123da3e26",
    "name": null,
    "x": 0.0,
    "y": 0.0,
    "crs": null
  }
}
```

### Pint Quantities
`infrasys` encodes metadata into component JSON when that component contains a `pint.Quantity`
instance. Here is an example of such a component:

```json
{
  "uuid": "711d2724-5814-4e0e-be5f-4b0b825b7f07",
  "name": "test",
  "distance": {
    "value": 2,
    "units": "meter",
    "__metadata__": {
      "fields": {
        "module": "infrasys.quantities",
        "type": "Distance",
        "serialized_type": "quantity"
      }
    }
  },
  "__metadata__": {
    "fields": {
      "module": "tests.test_serialization",
      "type": "ComponentWithPintQuantity",
      "serialized_type": "base"
    }
  }
}
```

## Time Series
If the user stores time series data in Arrow files (the default behavior), `infrasys` copies
the Arrow files into the user-specified directory during `system.to_json()`.

If the user instead chose to store time series in memory, `infrasys` serializes that data
into Arrow files in the user-specified directory during `system.to_json()`.
98 changes: 98 additions & 0 deletions _sources/explanation/system.md.txt
@@ -0,0 +1,98 @@
# System
The System class provides a data store for components and time series data.

Refer to the [System API](#system-api) for complete information.

## Items to consider for parent packages

### Composition vs Inheritance
Parent packages must choose one of the following:

1. Derive a custom System class that inherits from `infrasys.System`. Re-implement methods
   as desired. Add custom attributes to the System that will be serialized to JSON.

   - Reimplement `System.add_components` in order to perform custom validation or custom behavior.
     This is only needed for validation that needs information from both the system and the
     component. Note that the `System` constructor provides the keyword argument
     `auto_add_composed_components` that dictates how to handle the condition where a component
     contains another component which is not already attached to the system.

   - Reimplement `System.serialize_system_attributes` and `System.deserialize_system_attributes`.
     `infrasys` will call those methods during `to_json` and `from_json` and serialize/de-serialize
     the contents.

   - Reimplement `System.data_format_version` and `System.handle_data_format_upgrade`. `infrasys`
     will call the upgrade function if it detects a version change during de-serialization.

2. Implement an independent System class and compose the `infrasys.System`. This can be beneficial
   if you want to make the underlying system opaque to users.

   - This pattern requires that you call `System.to_json()` with the keyword argument `data` set
     to a dictionary containing your system's attributes. `infrasys` will add its contents to a
     field called `system` inside that dictionary.

3. Use `infrasys.System` directly. This is probably not what most packages want because they will
   not be able to serialize custom attributes or implement specialized behavior as discussed above.
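
The composition pattern in option 2 can be pictured with the following standalone sketch. It deliberately avoids importing `infrasys`: `StandInSystem` is a hypothetical stand-in that only mimics the `data` keyword behavior described above (its constructor and the `filename` parameter are assumptions), and `GridModel` and `base_power` are made-up names for the parent package's class and attribute.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


class StandInSystem:
    """Hypothetical stand-in for infrasys.System; mimics only the `data` keyword."""

    def __init__(self, name: str) -> None:
        self.name = name

    def to_json(self, filename: Path, data: Optional[dict] = None) -> None:
        payload = {} if data is None else data
        # The system's own contents go into a field called "system".
        payload["system"] = {"name": self.name}
        filename.write_text(json.dumps(payload, indent=2))


class GridModel:
    """Independent class that composes a system instead of inheriting from it."""

    def __init__(self, name: str, base_power: float) -> None:
        self._system = StandInSystem(name)  # kept private, opaque to users
        self.base_power = base_power

    def to_json(self, filename: Path) -> None:
        # Pass this class's own attributes through the `data` keyword.
        self._system.to_json(filename, data={"base_power": self.base_power})


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "grid.json"
    GridModel("grid", base_power=100.0).to_json(path)
    saved = json.loads(path.read_text())
```

The resulting file holds the parent package's attributes at the top level and the system's contents under `system`, so the wrapper can restore its own fields before handing the rest back to the composed system.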

### Units
`infrasys` uses the [pint library](https://pint.readthedocs.io/en/stable/) to help manage units.
Package developers should consider storing fields that are quantities as subtypes of
[Base.Quantity](#base-quantity-api). Pint performs unit conversion automatically when performing
arithmetic.

If you want to be able to generate JSON schema for a model that contains a Pint quantity, you must
add an annotation as shown below. Otherwise, Pydantic will raise an exception.

```python
from typing import Annotated

from pydantic import WithJsonSchema

from infrasys import Component
from infrasys.quantities import Distance


class ComponentWithPintQuantity(Component):
    distance: Annotated[Distance, WithJsonSchema({"type": "string"})]


ComponentWithPintQuantity.model_json_schema()
```

**Notes**:
- `infrasys` includes some basic quantities in [infrasys.quantities](#quantity-api).
- Pint will automatically convert a list or list of lists of values into a `numpy.ndarray`.
  `infrasys` will handle serialization/de-serialization of these types.


### Component Associations
The system tracks associations between components in order to optimize lookups.

For example, suppose a `Generator` class has a field for a `Bus`. It is trivial to find a generator's
bus. However, if you need to find all generators connected to a specific bus, you would have to
traverse all generators in the system and check their bus values.

Every time you add a component to a system, `infrasys` inspects the component type for composed
components. It checks for directly connected components, such as `Generator.bus`, and lists of
components. (It does not inspect other composite data structures like dictionaries.)

`infrasys` stores these component associations in a SQLite table and so lookups are fast.

Here is how to complete this example:

```python
generators = system.list_parent_components(bus)
```

If you only want to find specific types, you can pass that type as well.
```python
generators = system.list_parent_components(bus, component_type=Generator)
```

**Warning**: There is one potentially problematic case.

Suppose that you have a system with generators and buses and then reassign the buses, as in:
```python
gen1.bus = other_bus
```

`infrasys` cannot detect such reassignments and so the component associations will be incorrect.
You must tell `infrasys` to rebuild its internal table:
```python
system.rebuild_component_associations()
```
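
The following sketch illustrates both the fast lookup and the stale-association problem using the standard library's `sqlite3` module. It is not the `infrasys` schema: the table layout, the `rebuild` and `list_parents` functions, and the use of plain strings for components are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE associations (parent TEXT, child TEXT)")


def rebuild(generator_to_bus: dict) -> None:
    """Clear the table and repopulate it from the current field values."""
    conn.execute("DELETE FROM associations")
    conn.executemany(
        "INSERT INTO associations VALUES (?, ?)", generator_to_bus.items()
    )


def list_parents(bus: str) -> list:
    """Return the generators recorded as attached to the given bus."""
    rows = conn.execute(
        "SELECT parent FROM associations WHERE child = ? ORDER BY parent", (bus,)
    )
    return [row[0] for row in rows]


# Associations are recorded when components are added.
generator_to_bus = {"gen1": "bus1", "gen2": "bus1"}
rebuild(generator_to_bus)
before = list_parents("bus1")

# Reassigning the field does not touch the table, so the lookup is now stale.
generator_to_bus["gen1"] = "bus2"
stale = list_parents("bus1")

# Rebuilding brings the table back in line with the field values.
rebuild(generator_to_bus)
after = list_parents("bus1")
```

The lookup is a single indexed-style query instead of a scan over every generator, at the cost of having to rebuild the table whenever fields are reassigned behind its back.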
48 changes: 48 additions & 0 deletions _sources/explanation/time_series.md.txt
@@ -0,0 +1,48 @@
# Time Series
`infrasys` supports time series data expressed as a one-dimensional array of floats
using the class [SingleTimeSeries](#singe-time-series-api). Users must provide a `variable_name`,
which is typically the name of the component field being modeled. For example, if the user has a
time array associated with the active power of a generator, they would assign
`variable_name = "active_power"`.

Here is an example of how to create an instance of `SingleTimeSeries`:

```python
import random
from datetime import datetime, timedelta

from infrasys.time_series_models import SingleTimeSeries

time_series = SingleTimeSeries.from_array(
    data=[random.random() for x in range(24)],
    variable_name="active_power",
    initial_time=datetime(year=2030, month=1, day=1),
    resolution=timedelta(hours=1),
)
```
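
The array itself holds only values; the time axis is implied by `initial_time` and `resolution`. As a rough sketch, assuming the values are uniformly spaced so that value `i` belongs to `initial_time + i * resolution` (an assumption, not something the `infrasys` API is shown to expose here), the timestamps for the example above can be reconstructed with the standard library alone:

```python
from datetime import datetime, timedelta

initial_time = datetime(year=2030, month=1, day=1)
resolution = timedelta(hours=1)
length = 24  # number of values in the array

# Assumed mapping: value i is stamped initial_time + i * resolution.
timestamps = [initial_time + i * resolution for i in range(length)]

first = timestamps[0]
last = timestamps[-1]
```

Under that assumption, 24 hourly values starting at midnight cover exactly one day, with the last value stamped at 23:00.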

Users can attach their own attributes to each time array. For example,
there might be different profiles for different scenarios or model years.

```python
time_series = SingleTimeSeries.from_array(
    data=[random.random() for x in range(24)],
    variable_name="active_power",
    initial_time=datetime(year=2030, month=1, day=1),
    resolution=timedelta(hours=1),
    scenario="high",
    model_year="2035",
)
```

## Behaviors
Users can customize time series behavior with these flags passed to the `System` constructor:

- `time_series_in_memory`: The `System` stores each array of data in an Arrow file by default. This
is a binary file that enables efficient storage and row access. Set this flag to store the data in
memory instead.
- `time_series_read_only`: The default behavior allows users to add and remove time series data.
Set this flag to disable mutation. That can be useful if you are de-serializing a system, won't be
changing it, and want to avoid copying the data.
- `time_series_directory`: The `System` stores time series data on the computer's tmp filesystem by
default. This filesystem may be of limited size. If your data will exceed that limit, such as what
is likely to happen on an HPC compute node, set this parameter to an alternate location (such as
`/tmp/scratch` on NREL's HPC systems).

Refer to the [Time Series API](#time-series-api) for more information.
12 changes: 12 additions & 0 deletions _sources/how_tos/index.md.txt
@@ -0,0 +1,12 @@
```{eval-rst}
.. _how-tos-page:
```
# How Tos

```{eval-rst}
.. toctree::
:maxdepth: 2
:caption: Contents:

list_time_series
```