BloggerBust/lexicomb

Table of Contents

  1. Introduction
  2. Language Specification
    1. W3C EBNF (Extended Backus-Naur Form) Reference
    2. Tag Stream
    3. Tag Statements
      1. EBNF production rules
    4. Lexicon Script
      1. EBNF production rules
      2. Examples
    5. Lexicomb Engine
      1. Configuration
      2. Concurrency
  3. How to install it
  4. Example Usage
  5. How to contribute
    1. How to set up a developer environment
    2. Where to do your work
    3. Don't forget unit & integration tests
    4. Making commits
    5. Making a pull request
  6. Related Projects
    1. BbPyP
  7. License

Introduction

Lexicomb is a keyword-driven interpreted programming language. The name Lexicomb is a contraction of the words lexical, meaning content word, and combinator, meaning that which combines. The Lexicomb interpreter is composed of a lexical analyzer and a parser combinator.

Language Specification

Lexicomb source code has two representations: Tag Stream and Lexicon script. Lexicon script is kept in a file with the extension ls and saved in a directory that may contain many such files. All such files, taken together, constitute the lexicon. The name of the file becomes the first tag of the tag statement; therefore, file names may not contain spaces.

W3C EBNF (Extended Backus-Naur Form) Reference

I have adopted the use of the W3C standard notation for EBNF. Initially, I was using the ISO/IEC 14977 EBNF standard as described by Wikipedia's Extended Backus-Naur form page, but I found the W3C standard notation to be more compact thanks to its use of bracket expressions. Having said that, it is my opinion that Wikipedia did a better job of explaining the ISO/IEC 14977 notation. I particularly liked that the Wikipedia page organized reserved syntax in a table of symbols for quick reference.

Tag Stream

A Tag stream is a list of Tag Statements delimited by line endings.

Tag Statements

Tag statements are intended to take on the imperative mood with the least verbosity. That is, they should have the form of a terse instruction. Here are some examples:

Register John
Register Sally

Exercise John situps 45 07:05 07:12
Exercise Sally pushups 25 07:05 07:10

If a single word is not enough to describe the tag, then multiple words may be combined in Pascal case.

CreateString Combine each tag argument with a single space and return the result

EBNF production rules

tag_statement ::= tag space argument_list
argument ::= ( char | digit )+ | real
argument_list ::= argument_list space argument | argument ( delimiter argument )*
tag ::=  char ( char | digit )*
real ::= digit+ ( '.' digit+ )?
char ::= [_a-zA-Z]
digit ::= [0-9]
space ::= [ ] /* a single white space */
delimiter ::= [:]
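
The production rules above can be exercised with a small recognizer. The following is a minimal Python sketch, not Lexicomb's own lexer. The function names parse_tag_statement and parse_tag_stream are hypothetical, and two behaviours are assumptions: arguments joined by the ':' delimiter are flattened into one list (as the CreateHash example later in this document suggests), and blank lines in a tag stream are skipped.

```python
import re

# Regular expressions transcribed from the production rules above.
TAG = re.compile(r"[_a-zA-Z][_a-zA-Z0-9]*")                 # tag
ARGUMENT = re.compile(r"[_a-zA-Z0-9]+|[0-9]+(?:\.[0-9]+)?")  # argument


def parse_tag_statement(line):
    """Return (tag, args) for one tag statement, or raise ValueError."""
    tag, _, rest = line.partition(" ")
    if not TAG.fullmatch(tag):
        raise ValueError(f"invalid tag: {tag!r}")
    if not rest:
        raise ValueError("a tag statement needs at least one argument")
    args = []
    for group in rest.split(" "):      # 'space' separates argument groups
        for arg in group.split(":"):   # 'delimiter' separates arguments
            if not ARGUMENT.fullmatch(arg):
                raise ValueError(f"invalid argument: {arg!r}")
            args.append(arg)           # assumption: flatten into one list
    return tag, args


def parse_tag_stream(text):
    """Parse a tag stream: one tag statement per non-empty line."""
    return [parse_tag_statement(line) for line in text.splitlines() if line]
```

Under these assumptions, parse_tag_statement("Exercise John situps 45 07:05 07:12") yields the tag Exercise and seven arguments, because each time value is split at the delimiter.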

Lexicon Script

Lexicon script has a simple grammar. The only literal is of type real and includes both int and float types. Neither String nor Boolean types can be represented literally, but both may be created indirectly. Hash statements must be declared explicitly, but have no generalized literal representation. In short, the language directly supports basic arithmetic and logical expressions. Control flow is achieved via conditional_statement, conditional_repeat and tag_statement.

EBNF production rules

/* The following identifiers are inherited from the EBNF production rules for Tag Statements:
   - tag_statement
   - char
   - digit
   - real
*/

block ::= '{' statement_list? '}'

/* compound statement */
statement_list ::= statement_list ';' statement | statement

conditional_statement ::= '?' logical_expression block conditional_statement | '?' logical_expression block block | '?' logical_expression block

/* while any logical_expression in the conditional_statement tree is truthy, repeat the evaluation of the conditional_statement */
conditional_repeat ::= '@' conditional_statement 

statement ::= name ':=' expression ';'
    | name ':=' hash ';'
    | tag_statement ';' /* see EBNF production rules for tag statements */
    | conditional_statement
    | conditional_repeat

logical_expression ::= logical_expression '&&' relational_expression
    | logical_expression '||' relational_expression
    | '!' logical_expression | relational_expression

relational_expression ::= arithmetic_expression '<' arithmetic_expression
    | arithmetic_expression '<=' arithmetic_expression
    | arithmetic_expression '=' arithmetic_expression
    | arithmetic_expression '>' arithmetic_expression
    | arithmetic_expression '>=' arithmetic_expression

arithmetic_expression ::= expression '+' term
    | expression '-' term
    | term

expression ::= logical_expression | arithmetic_expression

term ::= term '*' factor
    | term '/' factor
    | factor

factor ::= name | real | expression

accessor ::= '[' name ']' | '[' real ']'
access ::= name accessor*
existence ::= '[' access ']' /* truthy if access is successful, falsy otherwise */

hash ::= '{' '}'
name ::= char ( char | digit )*
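
The access and existence productions describe a lookup that may fail at any depth without raising an error. As a rough illustration of that idea, here is a Python sketch. It is not taken from the Lexicomb source: the names access, exists and _MISSING are hypothetical, a scope is modelled as a plain dictionary, and keys are assumed to be already resolved to values.

```python
_MISSING = object()  # sentinel: distinguishes "absent" from a stored None


def access(scope, name, *keys):
    """Follow name[key][key]...; return _MISSING if any step fails."""
    value = scope.get(name, _MISSING)
    for key in keys:
        if value is _MISSING:
            break
        try:
            value = value[key]
        except (KeyError, IndexError, TypeError):
            # Missing key, index out of range, or a value that cannot be indexed.
            value = _MISSING
    return value


def exists(scope, name, *keys):
    """Model of the existence check: True only if the whole access succeeds."""
    return access(scope, name, *keys) is not _MISSING


scope = {"arg0": {"property": {"deep_property": 1}}, "args": ["a", "b"]}
has_arg0 = exists(scope, "arg0")
has_deep = exists(scope, "arg0", "property", "deep_property")
has_arg1 = exists(scope, "arg1")
```

Returning a sentinel instead of raising means a deep lookup needs only one check at the end, which is the property the existence production relies on.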

Examples

  1. Accessing arguments

    Arguments that are passed to a tag are named arg followed immediately by their zero-based positional value. They may also be collectively accessed via the args name. For example, suppose MyTag is called with three values:

    MyTag First second 3
    

    Then from within the tag definition those arguments may be accessed by their name as follows:

    {
      first_argument := arg0;
      second_argument := arg1;
      third_argument := arg2;
    }
    

    Or they may be accessed using the args name:

    {
      first_argument := args[0];
      second_argument := args[1];
      third_argument := args[2];
    }
    

    The access operator is safe to use at an arbitrary depth, without having to perform existence checks at each depth.

    {
        has_arg0 := [arg0];
        has_arg1 := [arg1];
        is_arg1_undefined := ![arg1];
        does_arg0_have_deep_property := [arg0[property][deep_property]];
    }
    
  2. Create negative value

    There are no unary operators specified in the EBNF. That does not limit us from creating negative numbers, or from changing the sign of a numeric value.

    ChangeSign.ls:

    {
      return 0 - arg0;
    }
    
    {
      x:= ChangeSign 5;
      y:= ChangeSign x;
      return CreateString first is x and second is y;
    }
    
  3. Create String

    Strings can be created by concatenating one or more name and real types.

    CreateString.ls:

    {
      count := 0;
      blank := ReturnNothing _;
      @?[args[count]]{
        ?[words]{
          words := words + blank + args[count];
        }
        {
          words := args[count];
        }
        count := count + 1;
      };
      return words;
    }
    

    The CreateString tag may be used to create a string of at least length one with a single space separating each word.

    {
      my_string := CreateString This is one way to create a string 1 2 3 4;
      my_string_with_a_single_leading_number := CreateString 1 2 3 leading numbers will be summed;
    }
    
  4. Create Boolean

    Logical expressions resolve to a True or False value. If that value is assigned to a name, then the result is a named boolean value that can be used in control flow or returned as a result.

    CreateBoolean.ls:

    {
      t := CreateString true;
      return arg0 = t;
    }
    
    {
      true := CreateBoolean true;
      false := CreateBoolean false;
      amount := arg0;
    
      ? arg1 = true {
        amount := ChangeSign amount;
      };
      ? amount <= 0 {
        return false;
      }
      # do stuff...
      return true;
    }
    
  5. Create hash

    A new and empty hash value can be assigned to a name.

    {
      my_hash := {};
    }
    

    However, as seen in the EBNF, it is not possible to initialize a hash value with a set of key-value pairs. A CreateHash tag can be used to encapsulate hash initialization.

    CreateHash.ls:

    {
      hash := {};
      key_index := 0;
      value_index := 1;
    
      key := ReturnNothing _;
      value := ReturnNothing _;
    
      @?[args[key_index]] && [args[value_index]] {
        key := args[key_index];
        hash[key] := args[value_index];
        key_index := value_index + 1;
        value_index := key_index + 1;
      };
    
      return hash;
    }
    
    return CreateHash first_name:John last_name:Doe age:99;
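
For readers more familiar with Python, the pairing loop in CreateHash.ls can be paraphrased as follows. This is only a sketch of the control flow: create_hash is a hypothetical name, and it assumes the tag receives a flat argument list such as first_name, John, last_name, Doe, age, 99.

```python
def create_hash(args):
    """Pair up a flat argument list as key, value, key, value, ..."""
    result = {}
    key_index, value_index = 0, 1
    # Mirrors '@?[args[key_index]] && [args[value_index]]': the loop stops as
    # soon as either position is missing, so a trailing unpaired key is ignored.
    while key_index < len(args) and value_index < len(args):
        result[args[key_index]] = args[value_index]
        key_index = value_index + 1
        value_index = key_index + 1
    return result


person = create_hash(["first_name", "John", "last_name", "Doe", "age", "99"])
```

Because both positions are checked before each iteration, an odd number of arguments does not raise an error; the last key is simply dropped.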
    

Lexicomb Engine

Configuration

Each bbpyp namespace has a Dependency Injector IoC container that accepts a Python dictionary named config.

  1. Logging

    config Key: logger

    Is Optional: True

    The logger configuration key may be set to any valid logging dictionary configuration. This parameter is entirely optional.

    There are two named loggers that can be configured:

    1. bbpyp.lexicomb
    2. bbpyp.lexicomb_engine
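
As an illustration, a config dictionary with a logger key might look like the sketch below. The formatter, handler and level choices are arbitrary assumptions; only the logger key and the two logger names come from this document. How the dictionary is handed to the container is not shown here, so the last line simply validates the logging section with the standard library.

```python
import logging
import logging.config

config = {
    "logger": {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "brief": {"format": "%(name)s %(levelname)s %(message)s"}
        },
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "formatter": "brief",
                "level": "DEBUG",
            }
        },
        "loggers": {
            # The two named loggers listed above; the levels are examples only.
            "bbpyp.lexicomb": {"handlers": ["console"], "level": "INFO"},
            "bbpyp.lexicomb_engine": {"handlers": ["console"], "level": "DEBUG"},
        },
    }
}

# Sanity check with the standard library: raises if the dictionary is invalid.
logging.config.dictConfig(config["logger"])
```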

Concurrency

The core of the Lexicomb Engine is the LexicombPubSubClient. The LexicombPubSubClient relays messages between Lexicomb's LexicalStateMachine and InterpreterStateMachine using TopicChannels. The number of concurrent publish and subscribe connections opened per TopicChannel is configurable using the Message Bus memory_channel_topic configuration option.

There are four topic names:

  1. bbpyp.lexical_state_machine.lexical_analyse
  2. bbpyp.interpreter_state_machine.parse
  3. bbpyp.interpreter_state_machine.evaluate
  4. bbpyp.interpreter_state_machine.report
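
To make the relay idea concrete, here is a toy model built on asyncio.Queue. It is not the LexicombPubSubClient and does not use the bbpyp Message Bus API. The functions subscriber and run_pipeline are hypothetical, and the sketch assumes that messages flow through the four topics in the order listed above, with a configurable number of subscribers per topic.

```python
import asyncio

TOPICS = [
    "bbpyp.lexical_state_machine.lexical_analyse",
    "bbpyp.interpreter_state_machine.parse",
    "bbpyp.interpreter_state_machine.evaluate",
    "bbpyp.interpreter_state_machine.report",
]


async def subscriber(inbox, outbox, topic, log):
    """Consume messages from one topic and relay them to the next one."""
    while True:
        message = await inbox.get()
        log.append((topic, message))
        if outbox is not None:
            await outbox.put(message)
        inbox.task_done()


async def run_pipeline(messages, subscribers_per_topic=2):
    """Push messages through every topic and return the delivery log."""
    queues = {topic: asyncio.Queue() for topic in TOPICS}
    log, workers = [], []
    for index, topic in enumerate(TOPICS):
        is_last = index + 1 == len(TOPICS)
        outbox = None if is_last else queues[TOPICS[index + 1]]
        for _ in range(subscribers_per_topic):
            task = asyncio.create_task(subscriber(queues[topic], outbox, topic, log))
            workers.append(task)
    for message in messages:
        await queues[TOPICS[0]].put(message)
    # Drain the stages in order so every message reaches the last topic.
    for topic in TOPICS:
        await queues[topic].join()
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return log


log = asyncio.run(run_pipeline(["Register John", "Register Sally"]))
```

Each message is logged once per topic, so two messages produce eight log entries regardless of how many subscribers share a topic.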

How to install it

To do…

Example Usage

To do…

How to contribute

I am happy to accept pull requests. If you need to get a hold of me you can create an issue or email me directly.

How to set up a developer environment

First, fork this repository and clone your fork to a local dev environment.

git clone https://github.com/<your-username>/lexicomb.git

Next, create a venv and install the latest pip and setuptools.

cd lexicomb
python -m venv venv
source venv/bin/activate
pip install -q --upgrade pip setuptools

Lastly, install the dev requirements declared in dev-requirements.txt and run the unit tests.

pip install -q -r dev-requirements.txt
python -m unittest discover

......................................................
----------------------------------------------------------------------
Ran 54 tests in 0.716s

OK

Where to do your work

Keep your mainline up to date with upstream.

git fetch origin --prune
git checkout master
git merge --ff-only origin/master

Make your changes in a feature branch.

git checkout -b branch_name

Don't forget unit & integration tests

Unit and integration tests are written using Python's unittest framework. The unit tests use the mock library. Please write both unit tests and integration tests to cover your contribution, except where existing tests already cover the change.
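
For contributors new to this style of testing, the sketch below shows the general shape of a unittest test case that uses a mock. The helper run_tag and the interpreter's evaluate method are hypothetical and exist only for this example; they are not part of Lexicomb. The sketch uses unittest.mock from the standard library.

```python
import unittest
from unittest.mock import Mock


def run_tag(interpreter, statement):
    """Hypothetical helper: split a tag statement and hand it to an interpreter."""
    tag, *args = statement.split(" ")
    return interpreter.evaluate(tag, args)


class TestRunTag(unittest.TestCase):
    def test_passes_tag_and_arguments_to_interpreter(self):
        # The mock stands in for a real collaborator and records how it is called.
        interpreter = Mock()
        interpreter.evaluate.return_value = "ok"

        result = run_tag(interpreter, "Register John")

        interpreter.evaluate.assert_called_once_with("Register", ["John"])
        self.assertEqual(result, "ok")
```

A test case like this is picked up by python -m unittest discover when it lives in a module whose name matches the default test*.py pattern.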

Making commits

Read Chris Beams's excellent article on writing commit messages and do your best to follow his advice.

Making a pull request

If you feel that your changes would be appreciated upstream, then it is time to create a pull request. Please write tests to validate your code changes and run all the tests again before making a pull request to defend against inadvertently breaking something.

python -m unittest discover

If you have made many intermediate commits in your feature branch, then please make a squash branch and rebase with a single squashed commit. A squash branch is just a spin-off branch where you can perform the squash and rebase without the fear of corrupting your feature branch. My preference is to perform an interactive rebase. Note that a squash branch is pointless if you only made a single commit.

First, switch to master and fast-forward it to the head of origin/master. This will reduce the risk of having a merge conflict later.

git checkout master
git fetch origin --prune
git merge --ff-only origin/master

Next, switch back to your feature branch and pull any changes fetched to master. If there are conflicts, then resolve them. Be sure to run all the tests once more if you had to merge with changes from upstream.

git checkout branch_name
git pull origin master
python -m unittest discover

Determine the first commit of the feature branch, which will be needed during the interactive rebase.

git log master..branch_name | grep -iE '^commit' | tail -n 1

commit f723dcc2c154662b3d6c366fb5ad923865687796

Then, create a squash branch as a spin-off of the feature branch and begin the interactive rebase following this guidance.

git checkout -b branch_name_squash
git rebase -i f723dcc^

Now, if you make a mistake during the rebase, but don't notice until after you have already committed, all of your precious commit history remains in the feature branch. Simply reset the squash branch back to the feature branch and start again. Once you are happy with your rebase, push the squash branch to remote and create a pull request.

Related Projects

BbPyP (Blogger Bust Python Project) is a collection of Python packages that I intend to use to help develop other more interesting Python projects.

License

Apache License v2.0
