Hial

Hial is a general-purpose data API library and CLI tool. It is a programmatic CRUD-like interface to various types of data represented as uniform tree structures. Hial proposes a relatively simple mental model, suitable for most use cases, which eliminates the accidental complexity of having to handle separate APIs for different data types. The tree model makes the data easy to read, explore and modify using a small number of functions.

The types of data that can be supported by this API are the file system, configuration files (json, yaml, toml), markup files (xml, html), programs written in various programming languages, operating system configurations and runtime parameters, database tables and records, http requests, etc.

Central to the Hial model is the idea of interpretation. Any piece of data has an implicit interpretation (i.e. "what the data means") which is sometimes difficult to grasp from context. In Hial the data is structured hierarchically, in a tree: each piece is connected to other pieces of data, and together they form an interpretation of some underlying raw data. For example, a json data tree can be an interpretation of some raw bytes from the network, or a python AST data tree can be an interpretation of the content of a file. Any piece of data can be reinterpreted as something else (whenever it makes sense), which allows the user to operate on data at the right semantic level for the task at hand.

The tree data model accepts a path API similar to xpath, json path, file system path, or other similar path languages. It is a concise way to express data searches common in programming and system administration.

⚠️ Hial is currently under construction. Some things don't work yet and some things will change.

What can it do?

1. Search for pieces of data in a structured way.

Print a value embedded in a json file or from a url that returns json or xml data:

hial './examples/productiondump.json^json/stacks/*/services'
hial 'http://api.github.com^http^json/rate_limit_url^http^json/resources/core'
hial 'http://www.phonetik.uni-muenchen.de/cgi-bin/BASRepository/oaipmh/oai.pl^http^xml'

Print all services with inaccessible images in a Docker compose file:

# shell
hial './config.yaml^yaml/services/*[ /image^split[":"]/[0]^http[HEAD]@status/code>=400 ]'
# 🚧 todo: split interpretation (regex[( ([^:]*): )*]
# 🚧 todo: get from docker image to docker url

// rust (native)
for service in Cell::new("./config.yaml").all("^yaml/services") {
    let image = service.to("/image");
    if image.to("^http[HEAD]@status/code") >= 400 {
        println!("service {} has an invalid image: {}",
            service.read().value()?,
            image.read().value()?
        );
    }
}

Print the structure of a rust file (struct, enum, type, functions) as a tree:

hial './src/tests/rust.rs^rust/**[#type^split["_"]/[-1]=="item"]/*[name|parameters|return_type]'
# 🚧 todo: search results as tree
# 🚧 todo: split interpretation

2. Modify data selected as above.

Change the default mysql port systemwide:

# shell
hial '/etc/mysql/my.cnf^fs[w]^ini/mysqld/port = 3307'

// rust
Cell::new("/etc/mysql/my.cnf^fs[w]^ini/mysqld/port")
    .write()
    .value(3307)?;

Change the user's docker configuration:

# shell
hial '~/.docker/config.json^fs[w]^json/auths/docker.io/username = "newuser"'
# 🚧 todo: create new object entities on write

// rust
Cell::new("~/.docker/config.json^fs[w]^json/auths/docker.io/username")
    .write()
    .value("newuser")?;

3. Copy pieces of data from one place to another.

Copy a string from some json object entry which is embedded in a zip file, into a rust string:

# shell
hial 'copy ./assets.zip^zip/data.json^json/meshes/sphere  ./src/assets/sphere.rs^rust/**[:let_declaration][/pattern=sphere]/value'
# 🚧 todo: support copy
# 🚧 todo: support zip
# 🚧 todo: /**[filter] should match leaves only

Split a markdown file into sections and put each in a separate file:

# shell
hial 'copy  ./book.md^md/*[:heading1][as x]  ./{label(x)}.md'
# 🚧 todo: support copy
# 🚧 todo: support markdown
# 🚧 todo: support interpolation in destination

4. Transform data from one format or shape into another.

Transform a json file into an xml file with the same format and vice versa:

hial 'copy  file.json^json^tree^xml  ./file.xml'
hial 'copy  file.xml^xml^tree^json  ./file.json'
# 🚧 todo: support copy
# 🚧 todo: support tree implementation and conversion

5. Structured diffs

Compare two files in different formats and print the resulting diff tree:

hial 'diff  ./file.json^json^tree  ./file.xml^xml^tree'
# 🚧 todo: support diff
# 🚧 todo: support tree implementation and conversion

Diff the files listed in a rust mod file against the actual list of files:

hial 'diff
    ./src/tests/mod.rs^rust/**/*[:mod_item]/name#value
    ./src/tests/*[:file]#label^regex["([^.]+).*"]/*/[0]'
# 🚧 todo: support diff

Diff two diff trees (e.g. check if two different commits make identical changes):

hial 'x = diff .^git/HEAD^fs .^git/HEAD~1^fs ;
      y = diff .^git/branch1^fs .^git/branch1~1^fs ;
      diff $x $y
     '
# 🚧 todo: support diff
# 🚧 todo: support git interpretation

Installation and usage

To test the examples or use the library from a shell, build the project: cargo build --release. Then run the hial command, e.g.: hial 'http://api.github.com^http^json'

The data model

The data model is that of a tree of simple data nodes. The tree has a root node and a hierarchy of child nodes.

Each data node is called a cell. It may have a value (a simple data type like string, number, bool or blob). A cell is part of a group of cells. The cell may have an index (a number) or a label (usually a string) to identify it in this group.

A cell may have subordinate cells (children in the tree structure) which are organized into a group. We call this the sub group. A cell may also have attributes or properties, which are also cells and are put into the attr group. The cells in both groups have the first cell as their parent.

A cell is always an interpretation of some underlying data. For example a series of bytes 7b 22 61 22 3a 31 7d can have multiple interpretations:

  1. a simple byte array which is represented by a single cell with the data as a blob value:
Cell: value = Blob([7b 22 61 22 3a 31 7d]),
  2. a string of utf-8 encoded characters which is represented by a single cell with the data as a string value:
Cell: value = String("{\"a\":1}"),
  3. a json object which is represented by a tree of cells, the root cell being the json object {} with a sub cell with label a and value 1:
Cell:
    type: "object",
    sub:
        Cell:
            label: "a",
            value: 1,
            type: "number",

Usually a piece of data has a humanly obvious best interpretation (e.g. json in the previous example), but the data can always be explicitly reinterpreted differently.
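The first two interpretations of that byte series can be sketched in plain Rust. This is only an illustration: the `Value` enum and the `as_blob` and `as_string` functions are hypothetical names chosen for this sketch, not part of the Hial API. The json interpretation is left out because it would need a json parser.

```rust
// Hypothetical sketch, NOT Hial's API: it only illustrates how the same
// bytes can be read at two different semantic levels.

/// A simple value, loosely following the data model's blob and string values.
#[derive(Debug, PartialEq)]
enum Value {
    Blob(Vec<u8>),
    Str(String),
}

/// The byte series used in the example above.
const RAW: [u8; 7] = [0x7b, 0x22, 0x61, 0x22, 0x3a, 0x31, 0x7d];

/// Interpretation 1: the raw bytes as a single blob value.
fn as_blob(raw: &[u8]) -> Value {
    Value::Blob(raw.to_vec())
}

/// Interpretation 2: the same bytes decoded as UTF-8 text.
/// Returns None when the bytes are not valid UTF-8, i.e. when this
/// interpretation does not make sense for the data.
fn as_string(raw: &[u8]) -> Option<Value> {
    std::str::from_utf8(raw)
        .ok()
        .map(|s| Value::Str(s.to_string()))
}
```

Calling `as_string(&RAW)` yields the string `{"a":1}`, while a byte series that is not valid UTF-8 only has the blob interpretation.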

A cell also has a string type describing its kind, depending on the interpretation. Such types can be: "file" or "folder" (in the fs interpretation), "array" (in the json interpretation), "function_item" (in the rust interpretation), "response" (in the http interpretation), etc.

---
title: Data model diagram
---
erDiagram
    Cell 1--o| "Sub Group" : "sub()"
    Cell 1--o| "Attr Group" : "attr()"
    Cell {
          int index
          value label
          value value
          string type
    }
    "Sub Group" 1--0+ "Cell" : "at(), get()"
    "Attr Group" 1--0+ "Cell" : "at(), get()"

Examples:

  • A folder of the file system is a cell. It has a sub group and may have sub cells (files or folders which it contains); it may also have a parent cell (parent folder). Its attr items are creation/modification date, access rights, size, etc. The folder name is the cell's label; the cell has no value.

  • A file of the file system is a cell. It has no sub items, may have one parent, has the same attr as a folder and the label as its name. A file cell can be interpreted in many other ways (string cell, json/yaml/xml cell tree, programming cell trees).

  • An entry in a json object is a cell. The json key in the key/value pair is the cell label. If the value of this json object entry is null, a bool or a number, then the cell will have a corresponding value and no sub; if it's an array or object then the cell will have a sub group with the content of the array or object.

  • A method in a java project is a cell. It has a parent class, access attributes (attr), and arguments, return type and method body as children (sub).

  • An http call response is a cell. It has status code and headers as attr and the returned body data as its value (a blob). It is usually further interpreted as either a string or json or xml etc.
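As a rough illustration of the folder example, the cell tree can be modeled with a few plain Rust types. Everything here is an assumption made for the sketch: the `Cell`, `Group` and `Value` types, the `example_folder` helper, the `"number"` type string and the size of 4096 are hypothetical and do not reflect Hial's real implementation. Only the names `at()` and `get()` are borrowed from the diagram.

```rust
// Hypothetical model of the cell tree, NOT Hial's actual types.

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Empty, // the cell has no value
    Str(String),
    Number(i64),
    Blob(Vec<u8>),
}

/// A group of cells, used for both the sub group and the attr group.
#[derive(Debug, Clone, Default)]
struct Group(Vec<Cell>);

impl Group {
    /// Select a cell by its position in the group.
    fn at(&self, index: usize) -> Option<&Cell> {
        self.0.get(index)
    }

    /// Select the first cell carrying the given label.
    fn get(&self, label: &str) -> Option<&Cell> {
        self.0.iter().find(|c| c.label.as_deref() == Some(label))
    }
}

#[derive(Debug, Clone)]
struct Cell {
    label: Option<String>,
    value: Value,
    ty: &'static str, // the cell's string type, e.g. "file" or "folder"
    sub: Group,
    attr: Group,
}

impl Cell {
    /// A cell with no sub cells and no attributes.
    fn leaf(label: &str, value: Value, ty: &'static str) -> Cell {
        Cell {
            label: Some(label.to_string()),
            value,
            ty,
            sub: Group::default(),
            attr: Group::default(),
        }
    }
}

/// The folder example: the name is the label, there is no value, contained
/// files are sub cells and metadata such as size lives in the attr group.
/// The size below is an invented number for illustration.
fn example_folder() -> Cell {
    Cell {
        label: Some("src".to_string()),
        value: Value::Empty,
        ty: "folder",
        sub: Group(vec![Cell::leaf("main.rs", Value::Empty, "file")]),
        attr: Group(vec![Cell::leaf("size", Value::Number(4096), "number")]),
    }
}
```

With this model, `example_folder().sub.get("main.rs")` plays the role of the path step `/main.rs`, and `example_folder().attr.get("size")` the role of `@size`.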

Path language

This unified data model naturally supports a path language similar to a file system path, xpath or json path. A cell is always used as a starting point (e.g. the file system current folder). The / symbol designates moving to the sub group; the @ symbol to the attr group. Jumping to a different interpretation is done using the ^ (elevate) symbol.

As a special case, the starting point of a path is allowed to be a valid url (starting with http:// or https://) or a file system path (which must be either absolute, starting with /, or relative, starting with .).

Other special operators are the * operator, which selects any cell in the current group, and the ** operator, which selects any cell in the current group and any of its descendants in the current interpretation. Filtering these cells is done by boolean expressions in brackets.
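The difference between * and ** can be sketched as two small functions over a toy tree. The `Node` type and the `star`, `double_star`, `filter_type` and `sample` functions are hypothetical names invented for this sketch, and the depth-first ordering is an assumption; Hial's own traversal may differ.

```rust
// Hypothetical node type and helpers, NOT Hial's API.

#[derive(Debug)]
struct Node {
    label: &'static str,
    ty: &'static str,
    sub: Vec<Node>,
}

/// `*`: every cell directly in the group.
fn star(group: &[Node]) -> Vec<&Node> {
    group.iter().collect()
}

/// `**`: every cell in the group plus all of its descendants,
/// collected depth first (the order is an assumption of this sketch).
fn double_star(group: &[Node]) -> Vec<&Node> {
    let mut out = Vec::new();
    for node in group {
        out.push(node);
        out.extend(double_star(&node.sub));
    }
    out
}

/// A bracket filter on the cell type, modeled as a simple predicate.
fn filter_type<'a>(cells: Vec<&'a Node>, ty: &str) -> Vec<&'a Node> {
    cells.into_iter().filter(|n| n.ty == ty).collect()
}

/// A tiny tree: one function containing one let declaration, and one struct.
fn sample() -> Vec<Node> {
    vec![
        Node {
            label: "main",
            ty: "function_item",
            sub: vec![Node { label: "x", ty: "let_declaration", sub: vec![] }],
        },
        Node { label: "Config", ty: "struct_item", sub: vec![] },
    ]
}
```

On the sample tree, `star` returns only `main` and `Config`, while `double_star` also reaches the nested `x`; applying `filter_type` afterwards narrows the selection the way a bracket expression does.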

Examples:

  • .^fs is the current folder ("." in the file system interpretation). It is equivalent to just ".".

  • ./src/main.rs is the main.rs file in the ./src/ folder.

  • ./src/main.rs@size is the size of this file (the size attribute of the file).

  • ./src/main.rs^rust represents the rust AST tree.

  • ./src/main.rs^rust/*[:function_item] are all the top-level cells representing functions in the main.rs rust file.

  • http://api.github.com is a url cell.

  • http://api.github.com^http is the http response of a GET request to this url.

  • http://api.github.com^http^json is the json tree interpretation of an http response of a GET request to this url.

  • http://api.github.com^http^json/rate_limit_url^http^json/resources/core/remaining makes one http call and uses a field in the response to make another http call, then selects a subfield in the returned json.

  • ./src/**^rust returns a list of all rust files (all files that have a rust interpretation) descending from the src folder.

  • ./src/**^rust/**[#type=="function_item"] lists all rust functions in all rust files in the src folder.

  • ./src/**^rust/**[#type=="function_item"]/**[#type=="let_declaration"] lists all occurrences of let declarations in all functions in all rust files in the src folder.

  • ./src/**^rust/**[#type=="function_item"]/**[#type=="let_declaration"][/pattern/*] lists only destructuring patterns in all occurrences of let declarations in all functions in all rust files in the src folder. The destructuring patterns are the only ones whose pattern cell has sub cells, so they are the only ones for which the filter [/pattern/*] is true.
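To make the role of the ^ symbol in these paths more concrete, here is a minimal sketch that cuts a path into its interpretation segments. The `split_interpretations` function is a hypothetical helper written for this README, not Hial's parser. It assumes that brackets are balanced and does not treat quoted strings specially, so a quoted filter argument containing an unmatched bracket would be split incorrectly.

```rust
/// Hypothetical helper, NOT Hial's parser: splits a path at every `^`
/// (elevate) that sits outside of [...] filter brackets.
fn split_interpretations(path: &str) -> Vec<&str> {
    let mut parts = Vec::new();
    let mut depth = 0usize; // current nesting level of [...] brackets
    let mut start = 0; // byte offset where the current segment begins
    for (i, ch) in path.char_indices() {
        match ch {
            '[' => depth += 1,
            ']' => depth = depth.saturating_sub(1),
            '^' if depth == 0 => {
                parts.push(&path[start..i]);
                start = i + 1; // '^' is one byte long, so this is a char boundary
            }
            _ => {}
        }
    }
    // The remainder after the last top-level '^' is the final segment.
    parts.push(&path[start..]);
    parts
}
```

For the path `./src/main.rs^rust/*[:function_item]` the sketch returns the starting point `./src/main.rs` followed by the segment `rust/*[:function_item]`; a `^` inside a bracket filter stays part of its enclosing segment.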

What's the current project status?

See status.md and issues.md.

The implementation language is Rust, and a Rust API is natively available.

As a command line tool hial can be used from any language that can call shell commands.

C, Python, Java, Go, Javascript wrappers are planned.