Cache Functions

$cache functions generalise $deps

A more general alternative to $dep variables is to use $cache functions. Consider the following example where $cache functions are associated with the dependent variables y1 and y2


$struct X isa IObject :
    model
    {
        int x1;
        int x2;
        int x3;
    }
{
    $cache int y1() const
    {
        return x1 + x2;
    };

    $cache int y2() const
    {
        return y1() + x3;
    };
};

Note that $cache functions always have const qualification. Note the const on the signature of y1:


    $cache int y1() const

Currently $cache functions must be inlined within the class definition.

The following code


CSpaceCreator cspace;
CSpaceTxn txn;

X ds;
RegisterNonGcObject(&ds);
ds.x1 = 10;
ds.x2 = 20;
ds.x3 = 50;

std::cout << "ds.y1 = " << ds.y1() << '\n';
std::cout << "ds.y1 = " << ds.y1() << '\n';
std::cout << "ds.y2 = " << ds.y2() << '\n';
std::cout << "ds.y2 = " << ds.y2() << '\n';
ds.EvictDgsNodes();

generates the following output:


[Calc y1]    ds.y1() = 30
ds.y1() = 30
[Calc y2]    ds.y2() = 80
ds.y2() = 80

This output shows that the results of the function call have been cached, so that subsequent calls to the function retrieve the cached value.

$cache functions with input parameters

$cache functions can have one or more input parameters. In that case the calculated values are cached in a map keyed by the input parameters in the implementation. The CSpace garbage collector may occasionally evict map entries (this is done on an LRU basis for all evictable DGS nodes in the CSpace).

Each map entry represents a separate dependent node in the dependency graph that records in-nodes and out-nodes. Also like a $dep each map entry performs early quiescence testing (this can be turned off for all map entries by specifying $cache^).

In this example the following dependency graph is formed:

                g()
                 |
                 |
  f(1)         f(2)
                /  \
               /    \
              /      \
             b1      b2

$struct X isa IObject :
    model
    {
        int x[4];
    }
{
    $cache int f(int a) const
    {
        return (a == x[1]) ? x[0] : std::max(x[2],x[3]);
    };

    $cache int g() const
    {
        return f(1) + f(2);
    }
};

$struct X isa IObject :
    model
    {
        int x[4];
    }
    public CWnd
{
    $cache <<async>> std::optional<int> f(int i) const 
        with int v = x[i]
    {
        return expensive_function(v);
    };

    $cache std::pair<int,int> g() const
    {
        int count = 0;
        int sum = 0;
        for (int i=0 ; i < 4 ; i++)
            if (auto r = f(i))
            {
                count++;
                sum += *r;
            }
        return { count, sum };
    }

    void draw() const
    {
        auto [count,sum] = g();
        if (count < 4) DrawSpinner();
        else           PrintSum(sum);
    }

    $dep void screen
        invalidate { CWnd::Invalidate(); }
        calc       { draw(); };

    void OnPaint() { screen(); }
};

The following C# code illustrates the idea of awaiting on four tasks, allowing them all to proceed concurrently, and avoiding blocking a thread



var task0 = f(0);
var task1 = f(1);
var task2 = f(2);
var task3 = f(3);

await Task.WhenAll(task0, task1, task2, task3);

$struct C :
    model
    {
        int b1;
        int b2;
    }
{
    $cache int f(int a) const
    {
        std::cout << "[Calc f(" << a << ") = (a==1) ? a : max(b1,b2)]";
        return (a == 1) ? a : std::max(b1.read(),b2.read());
    };

    $cache int g() const
    {
        std::cout << "[Calc g() = f(2)]";
        return f(2);
    }
};

The following code


ceda::CSpaceCreator cspace;
ceda::CSpaceTxn txn;

C c;
std::cout << "Setting b1 = 10 b2 = 20\n";
c.b1 = 10;
c.b2 = 20;

std::cout << "c.f(1) = " << c.f(1) << '\n';
std::cout << "c.f(2) = " << c.f(2) << '\n';

std::cout << "\nf(1) and f(2) have been cached in a map\n";
std::cout << "c.f(1) = " << c.f(1) << '\n';
std::cout << "c.f(2) = " << c.f(2) << '\n';

std::cout << "\nSetting b1 = 100.  This invalidates f(2) but not f(1)\n";
c.b1 = 100;
std::cout << "c.f(1) = " << c.f(1) << '\n';
std::cout << "c.f(2) = " << c.f(2) << '\n';

std::cout << '\n';
std::cout << "c.g() = " << c.g() << '\n';

std::cout << "\nSetting b2 = 50.\nThis invalidates f(2) but not g() because of early quiescence test on f(2)\n";
c.b2 = 50;
std::cout << "c.g() = " << c.g() << '\n';

c.EvictDgsNodes();

generates the following output:


Setting b1 = 10 b2 = 20
[Calc f(1) = (a==1) ? a : max(b1,b2)]    c.f(1) = 1
[Calc f(2) = (a==1) ? a : max(b1,b2)]    c.f(2) = 20

f(1) and f(2) have been cached in a map
c.f(1) = 1
c.f(2) = 20

Setting b1 = 100.  This invalidates f(2) but not f(1)
c.f(1) = 1
[Calc f(2) = (a==1) ? a : max(b1,b2)]    c.f(2) = 100

[Calc g() = f(2)]    c.g() = 100

Setting b2 = 50.
This invalidates f(2) but not g() because of early quiescence test on f(2)
[Calc f(2) = (a==1) ? a : max(b1,b2)]    c.g() = 100

Why $cache functions are important

Consider a grid layout that wants to cache the created child views


$cache-^ ptr<IView> GetChild(ssize_t row, ssize_t col) const
{
    return GetViewModel().CreateView(row,col);
}

In this case there is in effect a dependent node of the DGS associated with the creation of each child, so lots of opportunity for fine-grained dependencies. If instead a $dep which is a vector<ptr<IView>> is used then all the children are replaced whenever any of them need to be replaced.

The bigger problem with coarse DGS graphs is the appearance of cycles that wouldn't otherwise be there.

Subtle example of $cache functions which check for early quiescence

The following example illustrates one of the subtleties of the DGS. In this case there is a vector<int> member L, and a cache function f(i) which simply caches the ith element of L. f must be invoked with valid index values to avoid array bounds check failures.

None of the cache functions have ^ in the declaration:


    $cache int f(int i) const
    $cache int size() const
    $cache int sum() const

Therefore all these functions check whether their output has in fact changed before passing on hard-dirtiness. This behaviour can be seen in the output. After L is cleared, sum has only been marked as soft dirty. Therefore it invokes the read barrier on its inputs to see whether they have really changed.

Importantly this is done in the order in which the calls were originally made, and also as soon as one of the inputs has been seen to change the remaining inputs are not tested.

In this case it means that after L has been cleared, size() is called but not f(0), f(1) or f(2) so there is no array bounds check failure.

Dependency graph:

              sum
           /  /  \  \
          /  /    \  \
         /  /      \  \
      size f(0)  f(1) f(2)
         \  \      /  /
          \  \    /  /
           \  \  /  /
                L

$struct X :
    model
    {
        xvector<int> L;
    }
{
    $cache int f(int i) const
    {
        int v = L.read()[i];
        std::cout << "f(" << i << ") = " << v << '\n';
        return v;
    }
    $cache int size() const
    {
        std::cout << "size()\n";
        return (int) L.read().size();
    }
    $cache int sum() const
    {
        std::cout << "sum()\n";
        int n = size();
        int s = 0;
        for (int i=0 ; i < n ; ++i)
        {
            s += f(i);
        }
        return s;
    }
};

ceda::CSpaceCreator cspace;
ceda::CSpaceTxn txn;

X ds;

ds.L.push_back(1);
ds.L.push_back(2);
ds.L.push_back(3);
std::cout << "sum = " << ds.sum() << '\n' << '\n';

ds.L.clear();
std::cout << "sum = " << ds.sum() << '\n';

ds.EvictDgsNodes();

Output:


sum()
size()
f(0) = 1
f(1) = 2
f(2) = 3
sum = 6

size()      <-------- size() is called by read barrier of sum (because sum is soft dirty)
                      Note that f(0), f(1), f(2) are not called even though they are also
                      inputs of sum() at that time.
sum()
sum = 0

Input arguments

$cache functions support any number of input arguments. For example:


$cache- float foo(int a, int b, float c) const { return a+b+c; }

The inputs to $cache functions can now be passed by const reference. Xcpp automatically uses the underlying value type when defining the key type of the map.

$cache functions can be defined on complex input types. For example


$cache float64 foo( const xvector<float64>& x ) const
{
    return std::accumulate(x.begin(),x.end(),0);
}

An input parameter can be a const reference to a type which is not reflected. For example:


struct NotReflected {};
$cache int32 foo( const NotReflected & x ) const
{
    < etc >
}

Passing the output by reference

There is now specialised support for passing the output by reference so it can be calculated or accessed without any copying.

In that case make the function return void and pass the output by reference in the first argument. For example


$cache-^ void GetY(Output& y, Input x) const
{
    < etc >
}

This generates a function with signature


Output const& GetY(Input x) const;

This works with any number of input args and they can be passed by const reference.

Early quiescence testing

$cache functions support ^ like $dep in order to disable early quiescence testing. E.g. this is appropriate when the return type is a pointer to a heap allocated object.


$cache-^ X* foo() const { return new X; }

Reflection

$cache functions are currently not reflected. You can ensure reflection won't happen in the future using a - character just after the keywoard cache.

IObject visiting

The cached function call results are automatically visited (they may contain pointers to IObjects). Also the object containing the $cache function is visited.

DGS Logging

The following virtual function is defined on the base class of a DGS node


virtual xstring Name() const;

Xcpp generates implementations of this function for you, to support some very handy diagnostics.

Currently logging can only be enabled by recompiling the cxObject library (the following flags appear in DGNode.cpp):


@def bool trace_DGS_attach = false
@def bool trace_DGS_recalc = false
@def bool trace_DGS_invalidate_indep = false
@def bool trace_DGS_warn_inactive = false

It's handy to have separate flags for these. For example if you just want to see when dependents are being recalculated then only turn on trace_DGS_recalc. If the GUI is unresponsive you might start only turning on trace_DGS_invalidate_indep to verify the invocation of write barriers. If that looks ok then try turning on trace_DGS_warn_inactive to see whether the problem is that inactive dependents are being calculated. If that looks ok then turn on trace_DGS_recalc to see what actual recalculations are being performed. It's often useful to turn on trace_DGS_attach to investigate the actual dependency graph which is being formed, but you might easily have thousands of edges in a typical application.

When using $cache functions the diagnostic name of the $dep in the map is displayed as the qualified name of the function plus the values of the input arguments in brackets - so it looks just like a function call. E.g. DrawGridLayoutMixin::GetChild(0,0)

Example of diagnostics:


DGS calc wxGLCanvasForView::m_sizeAnalysisAgent
  DGS: 1 Attaching edge DrawGridLayoutMixin::GetSuggestedWidth() --> wxGLCanvasForView::m_sizeAnalysisAgent
  DGS calc DrawGridLayoutMixin::GetSuggestedWidth()
    DGS: 2 Attaching edge DrawGridLayoutMixin::GetSuggestedGroupColWidth(0) --> DrawGridLayoutMixin::GetSuggestedWidth()
    DGS calc DrawGridLayoutMixin::GetSuggestedGroupColWidth(0)
      DGS: 3 Attaching edge DrawGridLayoutMixin::GetSuggestedColWidth(0) --> DrawGridLayoutMixin::GetSuggestedGroupColWidth(0)
      DGS calc DrawGridLayoutMixin::GetSuggestedColWidth(0)
    DGS: 4 Attaching edge DrawGridLayoutMixin::GetSuggestedGroupColWidth(1) --> DrawGridLayoutMixin::GetSuggestedWidth()
    DGS calc DrawGridLayoutMixin::GetSuggestedGroupColWidth(1)
      DGS: 5 Attaching edge DrawGridLayoutMixin::GetSuggestedColWidth(1) --> DrawGridLayoutMixin::GetSuggestedGroupColWidth(1)
      DGS calc DrawGridLayoutMixin::GetSuggestedColWidth(1)
        DGS: 6 Attaching edge ScrollViewModel_Mixin_1::panView_ --> DrawGridLayoutMixin::GetSuggestedColWidth(1)
        DGS calc ScrollViewModel_Mixin_1::panView_
        DGS: 7 Attaching edge DrawPanViewMixin::child_ --> DrawGridLayoutMixin::GetSuggestedColWidth(1)
        DGS calc DrawPanViewMixin::child_
        DGS: 8 Attaching edge DrawGridLayoutMixin::GetSuggestedWidth() --> DrawGridLayoutMixin::GetSuggestedColWidth(1)
        DGS calc DrawGridLayoutMixin::GetSuggestedWidth()
          DGS: 9 Attaching edge DrawGridLayoutMixin::GetSuggestedGroupColWidth(0) --> DrawGridLayoutMixin::GetSuggestedWidth()
          DGS calc DrawGridLayoutMixin::GetSuggestedGroupColWidth(0)
            DGS: 10 Attaching edge DrawGridLayoutMixin::GetSuggestedColWidth(0) --> DrawGridLayoutMixin::GetSuggestedGroupColWidth(0)
            DGS calc DrawGridLayoutMixin::GetSuggestedColWidth(0)
              DGS: 11 Attaching edge DrawGridLayoutMixin::GetSuggestedColWidthAccordingToChildren(0) --> DrawGridLayoutMixin::GetSuggestedColWidth(0)
              DGS calc DrawGridLayoutMixin::GetSuggestedColWidthAccordingToChildren(0)
                DGS: 12 Attaching edge DrawGridLayoutMixin::GetChild(0,0) --> DrawGridLayoutMixin::GetSuggestedColWidthAccordingToChildren(0)

Example use in the multi-res-image

Conceptually the PersistStore provides both synchronous and asynchronous cache map functions to get the object with a given OID:


class PersistStore
{
    $cache ptr<IPersistable> GetObject(OID oid) const;    

    $cache ptr<IPersistable> <<async>> AsyncGetObject(OID oid) const;
};

The multiresimage demo uses an async $cache function to represent a cache updated asynchronously for the decompressed image tiles.


class MultiResImage
{
    ...
    $cache <<async>> Image GetImageTile(int level, int x, int y) const
        with Jpeg jpeg = LoadJpegTile(level,x,y)
    {
        return DecompressJpeg(jpeg);
    }
};