Saving memory & attribute composition #3065

brdvd · 2021-12-17T12:50:49Z

brdvd
Dec 17, 2021
Collaborator

Saving memory
Verovio is getting quite memory hungry. The following table shows memory peaks for load+layout+rendering:

MEI size	Memory peak	Ratio
226 KB	21.9 MB	96.9
1.1 MB	54.1 MB	49.2

A lot of the memory is consumed by the object tree, so let's have a look at some object sizes:

Class	Size in Bytes
Note	1768
LayerElement	816
LinkingInterface	296

This is just the fixed object size (i.e. the result of the sizeof operator); the actual memory consumed by these objects will be larger due to container members like std::string or std::vector. But it is already interesting to see that a sixth of each Note object consists of the linking interface, which sits inside each LayerElement and ControlElement and is hardly ever used.

In Verovio we have this unwritten law that we always store attributes by inheritance. So instead of storing attributes which are actually encoded, we store all attributes that could be used (from a rendering point of view). This has some neat advantages:

Attribute getters and setters are automatically available.
Availability of attributes can be checked via dynamic_cast.

But it also has the disadvantage that it unnecessarily eats up memory.

So the basic idea would be to allow attributes to be stored via composition as well as inheritance. Imagine that we rename the current LinkingInterface into LinkingInterfaceImpl and create a new LinkingInterface class which stores a pointer to LinkingInterfaceImpl. This pointer would be NULL as long as no linking is encoded, reducing the size of LinkingInterface to 8 bytes. So this alone would save almost 300 bytes on most of the layer and control elements. There is still a catch: how would an element gain access to an attribute stored inside LinkingInterfaceImpl? dynamic_cast would not work anymore. Even worse, the whole attribute detection mechanism that is currently used in Verovio and LibMEI would fail.
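A minimal sketch of that split, to make the size argument concrete. The members of LinkingInterfaceImpl and the accessor names HasLinking() and Linking() are invented for illustration and are not the actual Verovio members:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the payload of today's LinkingInterface.
struct LinkingInterfaceImpl {
    std::string m_next;
    std::string m_sameas;
};

// Slim wrapper: a single pointer that stays null until linking is encoded.
class LinkingInterface {
public:
    LinkingInterface() = default;
    ~LinkingInterface() { delete m_impl; }
    // Copying is disabled to keep the sketch short; real code would deep-copy.
    LinkingInterface(const LinkingInterface &) = delete;
    LinkingInterface &operator=(const LinkingInterface &) = delete;

    bool HasLinking() const { return m_impl != nullptr; }

    // Allocates the implementation on first access.
    LinkingInterfaceImpl &Linking()
    {
        if (!m_impl) m_impl = new LinkingInterfaceImpl();
        return *m_impl;
    }

private:
    LinkingInterfaceImpl *m_impl = nullptr;
};

// The wrapper costs exactly one pointer per element.
static_assert(sizeof(LinkingInterface) == sizeof(void *), "wrapper should be pointer-sized");

// Usage: nothing is allocated until a linking value is written.
inline void DemoLinking()
{
    LinkingInterface linking;
    assert(!linking.HasLinking());
    linking.Linking().m_next = "#note2";
    assert(linking.HasLinking());
}
```

Elements without any linking pay only for the pointer; elements with linking pay for the pointer plus the heap-allocated implementation.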

Current attribute detection
During element construction we currently register all attribute and interface ids and store them in a std::vector in Object (see m_attClasses and m_interfaces). We then have the functions bool Object::HasAttClass(AttClassId attClassId) and bool HasInterface(InterfaceId interfaceId), which check for the existence of an attribute or interface by searching for the id in the corresponding vector. Finally, we typically use the following pattern to retrieve an attribute if it exists:

if (element->HasAttClass(ATT_NAME)) {
    AttClass *att = dynamic_cast<AttClass *>(element);
    assert(att);
    ...
}
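For readers who do not have the code in front of them, here is a reduced, self-contained sketch of this registration scheme. RegisterAttClass, HasAttClass and m_attClasses are the names used in this thread; the two-entry enum, the AttColor class and the GetColorAtt helper are simplified stand-ins made up for the example:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical ids; the real AttClassId enum is generated and much larger.
enum AttClassId { ATT_COLOR, ATT_STEMS };

class Object {
public:
    virtual ~Object() = default;
    void RegisterAttClass(AttClassId id) { m_attClasses.push_back(id); }
    bool HasAttClass(AttClassId id) const
    {
        return std::find(m_attClasses.begin(), m_attClasses.end(), id) != m_attClasses.end();
    }

private:
    std::vector<AttClassId> m_attClasses; // one copy per object instance
};

// Simplified attribute class, stored by inheritance.
class AttColor {
public:
    virtual ~AttColor() = default;
    int m_color = 0;
};

class Note : public Object, public AttColor {
public:
    Note() { this->RegisterAttClass(ATT_COLOR); }
};

// The pattern from above: look up the id first, then cross-cast.
inline AttColor *GetColorAtt(Object *element)
{
    if (!element->HasAttClass(ATT_COLOR)) return nullptr;
    AttColor *att = dynamic_cast<AttColor *>(element);
    assert(att);
    return att;
}
```

Note that every Note instance carries its own m_attClasses vector, although its content is identical for all notes.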

Detection with attribute composition
The idea is to have a function which collects all attributes stored by composition and not inheritance:

void Element::CollectCompositeAttributes(ListOfObjects &objects, ListOfInterfaces &interfaces, ListOfAttributes &attributes)
{
    // 1. Call this function for each object and interface base class
    InterfaceBase::CollectCompositeAttributes(objects, interfaces, attributes);
    ...
    // 2. Register any objects, interfaces or attributes which are stored by composition and call the function on them
    if (m_interfacePointer) {
        m_interfacePointer->CollectCompositeAttributes(objects, interfaces, attributes);
        interfaces.push_back(m_interfacePointer);
    }
}

With this we could change the attribute detection into

template <typename T>
T *Object::GetAttribute()
{
    // 1. Check for direct inheritance
    T *retVal = dynamic_cast<T *>(this);
    if (retVal) return retVal;

    // 2. Collect all attributes stored by composition
    ListOfObjects objects;
    ListOfInterfaces interfaces;
    ListOfAttributes attributes;
    this->CollectCompositeAttributes(objects, interfaces, attributes);

    // 3. Try casting each collected object, interface and attribute
    for (Object *object : objects) {
        retVal = dynamic_cast<T *>(object);
        if (retVal) return retVal;
    }
    ...

    // 4. Return NULL if no cast succeeded
    return NULL;
}

On client side we could check for attribute existence like this:

AttClass *att = element->GetAttribute<AttClass>();
if (att) {
    ...
}

Note that CollectCompositeAttributes must be implemented in each object and interface subclass. While this will be the biggest change, it is similar to the current situation, where we must register attributes during construction in each subclass. We would also get rid of the attribute and interface id vectors, which would save additional memory.
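To check that the lookup works end to end, here is a compilable reduction of the idea. It is only a sketch: the three lists are collapsed into a single list of a hypothetical Composite base class, and AttColor, LinkingInterfaceImpl and EnableLinking() are simplified stand-ins rather than the real classes:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical polymorphic base for anything stored by composition.
struct Composite {
    virtual ~Composite() = default;
};
using ListOfComposites = std::vector<Composite *>;

// An attribute class that is still stored by inheritance.
struct AttColor {
    virtual ~AttColor() = default;
    int m_color = 0;
};

// A rarely used part that is only allocated on demand.
struct LinkingInterfaceImpl : Composite {
    int m_linkCount = 0;
};

class Object {
public:
    virtual ~Object() = default;

    // Subclasses append the parts they currently hold by composition.
    virtual void CollectCompositeAttributes(ListOfComposites &) {}

    template <typename T> T *GetAttribute()
    {
        // 1. Direct inheritance
        if (T *direct = dynamic_cast<T *>(this)) return direct;
        // 2. Parts stored by composition
        ListOfComposites parts;
        this->CollectCompositeAttributes(parts);
        for (Composite *part : parts) {
            if (T *found = dynamic_cast<T *>(part)) return found;
        }
        return nullptr;
    }
};

class Note : public Object, public AttColor {
public:
    void EnableLinking() { m_linking = std::make_unique<LinkingInterfaceImpl>(); }

    void CollectCompositeAttributes(ListOfComposites &parts) override
    {
        Object::CollectCompositeAttributes(parts);
        if (m_linking) parts.push_back(m_linking.get());
    }

private:
    std::unique_ptr<LinkingInterfaceImpl> m_linking;
};

inline void DemoGetAttribute()
{
    Note note;
    assert(note.GetAttribute<AttColor>());              // found by inheritance
    assert(!note.GetAttribute<LinkingInterfaceImpl>()); // not allocated yet
    note.EnableLinking();
    assert(note.GetAttribute<LinkingInterfaceImpl>());  // found by composition
}
```

One thing the sketch makes visible: every lookup that misses the direct cast builds a temporary vector, so the composition path is slower than the current id check.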

Open questions

Is there a consensus that reducing memory in Verovio is desired?
Are there any advantages of attribute inheritance that I have missed out?
Are there any issues with the suggested approach to attribute detection? Please feel free to suggest alternative approaches.

craigsapp · 2021-12-17T13:11:39Z

craigsapp
Dec 17, 2021
Collaborator

Is there a consensus that reducing memory in Verovio is desired?

I think it is a good idea, but priority should be given to speed whenever there is a tradeoff between the two. The advantage of lower memory use would be to allow for longer works, such as multiple movements of a single work, or other possibilities that lower memory usage would open up (such as separate instances of verovio for each movement in the javascript toolkit).

brdvd · 2021-12-17T14:58:57Z

brdvd
Dec 17, 2021
Collaborator Author

I totally agree, speed should always have higher priority than memory. I think the memory usage only becomes an issue when rendering larger files in confined environments (like the browser) or when running several instances of Verovio in parallel.

lpugin · 2021-12-20T07:34:40Z

lpugin
Dec 20, 2021
Maintainer

Yes. The third thing we need to keep under control is the executable size, but I don't think the point raised here will affect it. I completely agree that we should try to reduce memory usage. Up to now, we have been monitoring speed pretty well, but memory usage not so much. One thing we have already been doing, though, is to avoid object members and to use pointers instead, with the object allocated only when needed. That can make a pretty big difference.

To give you some background on the multiple inheritance approach: it was adopted from the very beginning, when we integrated the modified version of LibMEI. Originally, LibMEI uses mixins:
https://github.com/DDMAL/libmei/blob/2a309c52eea5326ad0ec7fa71bd6d9c5e67bcfe1/src/modules/cmn.h#L56-L78

But since we can do multiple inheritance in C++, I thought that would be more appropriate. Then with an Accid object you can call accid.GetAccid() and do not have to call accid.m_Accidental.GetAccid(). That is, we have something that reflects <accid accid="f"> regardless of the structure of the schema in the ODD. Of course, changing from mixins as member objects to multiple inheritance did not make memory usage (regarding attributes) any worse or any better. All this to say that multiple inheritance makes the Verovio code intuitive and very close to MEI, and I think we should keep it.

I am not absolutely sure that this is what you have in mind, but I think changing to an interface / implementation structure would not necessarily mean that we need to abandon multiple inheritance. Instead of having only the Att* classes generated by LibMEI, we would have a pair of classes for each, Att* and Att*Impl. The Att* classes would not hold the attributes but only a pointer to an Att*Impl object and would serve as a proxy to the attributes. The Verovio classes would still inherit from the Att* classes. For example, AttAccidLog would remain the same, except for its member, which would be only a pointer to the impl:

class AttAccidLog : public Att {
public:
    AttAccidLog();
    virtual ~AttAccidLog();

    /**
     * @name Setters, getters and presence checker for class members.
     * The checker returns true if the attribute class is set (e.g., not equal
     * to the default value)
     **/
    ///@{
    void SetFunc(accidLog_FUNC func_);
    accidLog_FUNC GetFunc();
    bool HasFunc() const;
    ///@}

private:
    AttAccidLogImpl *m_impl;
};

All the accessors in Att* classes would check the allocation of the implementation. For example:

void AttAccidLog::SetFunc(accidLog_FUNC func_)
{
    if (!m_impl) m_impl = new AttAccidLogImpl();
    m_impl->SetFunc(func_);
}

Checking presence would be:

bool AttAccidLog::HasFunc() const
{
    return (m_impl ? m_impl->HasFunc() : false);
}

Resetting would delete the implementation:

void AttAccidLog::Reset()
{
    if (m_impl) {
        delete m_impl;
        m_impl = NULL;
    }
}

Changing this can be done simply by refactoring LibMEI and it would be totally transparent for the rest of the Verovio codebase, I think.
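Put together, the proxy / implementation pair could look roughly like the following. This is a sketch under assumptions, not generated LibMEI output: the enum is cut down to three values, the Att base class is left out, and the behaviour of GetFunc() on an unallocated implementation (returning the NONE value) is my guess at what the generator would emit:

```cpp
#include <cassert>

// Cut-down version of the generated enum, for illustration only.
enum accidLog_FUNC { accidLog_FUNC_NONE = 0, accidLog_FUNC_caution, accidLog_FUNC_edit };

// Holds the actual attribute values.
class AttAccidLogImpl {
public:
    void SetFunc(accidLog_FUNC func_) { m_func = func_; }
    accidLog_FUNC GetFunc() const { return m_func; }
    bool HasFunc() const { return (m_func != accidLog_FUNC_NONE); }

private:
    accidLog_FUNC m_func = accidLog_FUNC_NONE;
};

// Proxy that Verovio classes would keep inheriting from.
class AttAccidLog {
public:
    AttAccidLog() = default;
    virtual ~AttAccidLog() { delete m_impl; }
    // Copying is disabled to keep the sketch short; real code would deep-copy.
    AttAccidLog(const AttAccidLog &) = delete;
    AttAccidLog &operator=(const AttAccidLog &) = delete;

    void SetFunc(accidLog_FUNC func_)
    {
        if (!m_impl) m_impl = new AttAccidLogImpl();
        m_impl->SetFunc(func_);
    }
    // Falls back to the default value when nothing has been allocated.
    accidLog_FUNC GetFunc() const { return (m_impl ? m_impl->GetFunc() : accidLog_FUNC_NONE); }
    bool HasFunc() const { return (m_impl ? m_impl->HasFunc() : false); }

    void Reset()
    {
        delete m_impl; // deleting a null pointer is a no-op
        m_impl = nullptr;
    }

private:
    AttAccidLogImpl *m_impl = nullptr;
};

inline void DemoAccidLog()
{
    AttAccidLog att;
    assert(!att.HasFunc()); // nothing allocated, nothing set
    att.SetFunc(accidLog_FUNC_edit);
    assert(att.GetFunc() == accidLog_FUNC_edit);
    att.Reset();
    assert(!att.HasFunc());
}
```

Since Verovio objects get cloned, the generated proxy would also need a copy constructor and assignment operator that duplicate the implementation; the sketch sidesteps this by deleting both.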

This means that we can keep the Att* class registration. However, we can also look at improving this. This is typically something that should be static, and making it a member for every object was clearly not a thoughtful design. One idea would be to make Object::HasAttClass(AttClassId attClassId) const virtual and to override it for every class that implements an Att* class. For example, BTrem would be

bool BTrem::HasAttClass(AttClassId attClassId) const
{
    static const std::vector<AttClassId> s_attClasses { ATT_BTREMLOG, ATT_TREMMEASURED };
    if (std::find(s_attClasses.begin(), s_attClasses.end(), attClassId) != s_attClasses.end()) return true;
    // otherwise look for the parent class
    return LayerElement::HasAttClass(attClassId);
}

That would replace all the RegisterAttClass(). We need to think about the interface classes, but we can probably find something similar.
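A compilable reduction of this idea, to show the chaining through the class hierarchy. The three-entry enum and the choice of ATT_COLOR for LayerElement are made up for the example and do not reflect the real attribute lists:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical subset of the generated ids.
enum AttClassId { ATT_BTREMLOG, ATT_TREMMEASURED, ATT_COLOR };

class Object {
public:
    virtual ~Object() = default;
    // Base case: a plain Object implements no attribute class.
    virtual bool HasAttClass(AttClassId) const { return false; }
};

class LayerElement : public Object {
public:
    bool HasAttClass(AttClassId attClassId) const override
    {
        // Shared by all instances; initialized once on first call.
        static const std::vector<AttClassId> s_attClasses{ ATT_COLOR };
        if (std::find(s_attClasses.begin(), s_attClasses.end(), attClassId) != s_attClasses.end()) return true;
        return Object::HasAttClass(attClassId);
    }
};

class BTrem : public LayerElement {
public:
    bool HasAttClass(AttClassId attClassId) const override
    {
        static const std::vector<AttClassId> s_attClasses{ ATT_BTREMLOG, ATT_TREMMEASURED };
        if (std::find(s_attClasses.begin(), s_attClasses.end(), attClassId) != s_attClasses.end()) return true;
        // otherwise look for the parent class
        return LayerElement::HasAttClass(attClassId);
    }
};

inline void DemoStaticRegistration()
{
    BTrem bTrem;
    assert(bTrem.HasAttClass(ATT_BTREMLOG)); // own attribute class
    assert(bTrem.HasAttClass(ATT_COLOR));    // inherited from LayerElement
}
```

With this layout the objects themselves no longer store any id vector; the lists live once per class in static storage.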

What do you think?

brdvd · 2021-12-20T13:21:13Z

brdvd
Dec 20, 2021
Collaborator Author

Yes, multiple inheritance definitely has nice benefits and I absolutely prefer something like accid.GetAccid() over accid.m_Accidental.GetAccid(). I want to add some further analysis to illustrate possible memory savings of your suggested approach. For this I have picked some arbitrary attributes and evaluated their sizes:

Class	Stores	Size in Bytes
AttBtremLog	enum	16
AttExpandable	enum	16
AttGraced	enum, double	24
AttPedalLog	enum, string	40
AttSlurRend	enum, subclass	40
AttNoteGes	enum, enum, char, int	24
AttAccidental	enum	16
AttColoration	enum	16
AttCurvature	string, string, enum	64
AttDurationRatio	int, int	16

Now compare this to an attribute storing a single pointer, which would have a size of 16 bytes. Hence it would save memory in only half of the cases, and it would add memory for all attributes which actually have encoded values (i.e. where the impl pointer is non-null). But this brings me to another curiosity: why does an attribute storing a single enum (or pointer) require 16 bytes? There are two reasons for this:

We use virtual destructors, so each attribute additionally stores a vtable pointer.
We use full sized enums.
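Both effects can be reproduced with a few lines. The enum and the two structs are hypothetical and only serve to compare sizes; on a typical 64-bit build I would expect 16 bytes for the variant with the virtual destructor (8 for the vtable pointer, 4 for the enum, 4 for padding) and 4 bytes for the plain one, but the exact numbers depend on the platform:

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical enum without a fixed underlying type (usually int-sized).
enum data_EXAMPLE { EXAMPLE_NONE = 0, EXAMPLE_a, EXAMPLE_b };

// Same shape as the current attribute classes: virtual destructor plus one enum.
struct AttWithVirtualDtor {
    virtual ~AttWithVirtualDtor() = default;
    data_EXAMPLE m_value = EXAMPLE_NONE;
};

// The same payload without any virtual function.
struct AttPlain {
    data_EXAMPLE m_value = EXAMPLE_NONE;
};

inline void DemoVtableOverhead()
{
    std::printf("with virtual destructor: %zu bytes\n", sizeof(AttWithVirtualDtor));
    std::printf("without:                 %zu bytes\n", sizeof(AttPlain));
    // The vtable pointer comes on top of the enum (plus alignment padding).
    assert(sizeof(AttWithVirtualDtor) >= sizeof(void *) + sizeof(data_EXAMPLE));
    assert(sizeof(AttPlain) == sizeof(data_EXAMPLE));
}
```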

So if we keep the always-multiple-inheritance approach and make changes at a low level (LibMEI generation), then I would suggest the following:

Drop virtual attribute destructors and replace them with protected non-virtual destructors. This follows guideline 4 from Herb Sutter's blog. Dropping the vtable pointer reduces the size of each attribute by 8 bytes. It would further enforce that we can use attributes only by inheritance. As far as I can see, there is only one place where we currently don't follow this rule (in class OptionStaffrel), but that should be fixable. Maybe we should give it a try.
Use custom enum sizes. This should not be problematic, and we should be able to choose between uint8_t and uint16_t during LibMEI generation, depending on the number of enum entries. Example:

enum data_CERTAINTY: uint8_t {
    CERTAINTY_NONE = 0,
    ...

Doing both of these should reduce the size of single enum attributes to 1 byte (2 bytes if the enum has more than 256 entries, which rarely happens) and should also noticeably reduce the size of all other attributes.
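A small sketch combining both changes. The attribute class AttCertaintyDemo and the Element class are invented for the example, and the enum entries after CERTAINTY_NONE are filled in for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Fixed one-byte underlying type; entries after NONE are illustrative.
enum data_CERTAINTY : uint8_t { CERTAINTY_NONE = 0, CERTAINTY_high, CERTAINTY_medium, CERTAINTY_low, CERTAINTY_unknown };

// Hypothetical attribute class with a protected, non-virtual destructor.
class AttCertaintyDemo {
public:
    void SetCert(data_CERTAINTY cert_) { m_cert = cert_; }
    data_CERTAINTY GetCert() const { return m_cert; }
    bool HasCert() const { return (m_cert != CERTAINTY_NONE); }

protected:
    // Only derived classes can create or destroy the attribute, so it
    // cannot be deleted through an AttCertaintyDemo pointer.
    AttCertaintyDemo() = default;
    ~AttCertaintyDemo() = default;

private:
    data_CERTAINTY m_cert = CERTAINTY_NONE;
};

// The attribute is usable by inheritance only.
class Element : public AttCertaintyDemo {};

static_assert(sizeof(data_CERTAINTY) == 1, "enum should occupy one byte");

inline void DemoSmallAttribute()
{
    Element element;
    assert(!element.HasCert());
    element.SetCert(CERTAINTY_high);
    assert(element.GetCert() == CERTAINTY_high);
    // No vtable pointer and a one-byte enum: the attribute is one byte.
    assert(sizeof(AttCertaintyDemo) == sizeof(data_CERTAINTY));
}
```

A line such as `AttCertaintyDemo att;` outside of a derived class no longer compiles, which is exactly the restriction that OptionStaffrel would currently run into.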

Concerning the registration, I really like the idea of storing the attribute ids once per class and not per object. This should also save lots of memory.

lpugin · 2021-12-20T13:44:37Z

lpugin
Dec 20, 2021
Maintainer

Thanks for the analysis. Using custom enum sizes looks good, and so does dropping virtual destructors.

One quick question: do you think it would make sense to have some classes using the interface / implementation pair and some not? When the attribute class stores only a one-byte enum, then having the interface / implementation pair would actually increase memory usage because of the additional pointer, wouldn't it? However, when the attribute class uses strings or vectors, then having the interface / implementation pair is useful. So it seems to me that it would be optimal to use both approaches. Which one to use is something we could indicate in https://github.com/rism-digital/libmei/blob/develop/tools/includes/vrv/config.yml on a case-by-case basis. We could also detect it based on the type of the attributes, but that might be over the top.

rettinghaus · 2021-12-20T13:47:12Z

rettinghaus
Dec 20, 2021
Collaborator

Maybe @ahankinson could add his thoughts here, because some of the proposed improvements for LibMEI could/should go upstream?

brdvd · 2021-12-20T14:08:44Z

brdvd
Dec 20, 2021
Collaborator Author

Yes, a mixed approach should be perfect and using interface / implementation whenever the attribute stores a container object (string / vector) sounds like a good rule of thumb.

Also note that for interface / implementation we might consider using std::unique_ptr, which has no memory overhead compared to a raw pointer. There is also std::optional, but according to this, it has some memory overhead...
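The size difference is easy to check. The following sketch needs C++17 for std::optional; AttCurvatureImpl and its members are hypothetical and only mirror the "string, string, enum" layout from the table above. The equality between std::unique_ptr and a raw pointer holds for the default deleter on the common standard library implementations, but as far as I know the standard does not guarantee it:

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>

// Hypothetical implementation class with two strings and an enum-like value.
struct AttCurvatureImpl {
    std::string m_bezier;
    std::string m_bulge;
    int m_curvedir = 0;
};

struct ViaRawPointer {
    AttCurvatureImpl *m_impl = nullptr;
};

struct ViaUniquePtr {
    std::unique_ptr<AttCurvatureImpl> m_impl; // owns the impl, frees it automatically
};

struct ViaOptional {
    std::optional<AttCurvatureImpl> m_impl; // stores the impl inline plus an engaged flag
};

inline void DemoOwnership()
{
    ViaUniquePtr att;
    assert(!att.m_impl);                               // nothing allocated yet
    att.m_impl = std::make_unique<AttCurvatureImpl>(); // allocate when a value is encoded
    att.m_impl->m_bulge = "1.5";
    att.m_impl.reset();                                // frees the impl again
    assert(!att.m_impl);
}
```

std::optional keeps the full implementation inside the object even when it is empty, so it would not help with the memory problem discussed here; std::unique_ptr mainly removes the manual delete from the destructor and from Reset().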

ahankinson · 2021-12-20T14:09:34Z

ahankinson
Dec 20, 2021
Maintainer

I talked it over with @lpugin -- basically the Verovio fork of libmei changed the inheritance model, and added data types, so it's quite different already. Also, I'm not a C++ dev, so I'll defer deciding the best course of action to those who are!
