Home
Reading
Searching
Subscribe
Sponsors
Statistics
Posting
Contact
Spam
Lists
Links
About
Hosting
Filtering
Features Download
Marketing
Archives
FAQ
Blog
 
Gmane
From: Barend Gehrels <barend <at> geodan.nl>
Subject: Re: [Review] ITL review starts today, February 18th
Newsgroups: gmane.comp.lib.boost.devel
Date: Sunday 28th February 2010 16:48:23 UTC (over 7 years ago)
Hi List, Hartmut, Joachim,


This is my review of the Interval Template Library (ITL). YES, I vote 
+1, this library should be accepted into Boost. It is very very useful. 
During my review my enthousiasm was only growing. Even if there was a 
thing confusing on the operators (for me). Yes, I can use this library 
in real life and it will solve problems for me.



1) WHAT I DID
For this review I've done three projects of which two are related to my 
day-to-day or hobby programming efforts.

a) The first project is related to genealogy. I took up the idea of Phil 
and generated an interval map of livetimes of all persons in my 
genealogical database. This database contains ~3000 persons.
Many of the persons are still alive. ITL contains a right open interval 
but that denotes the right-hand-side excluding, while I need an interval 
that is unlimited at one side. Of course I could workaround using 9999 
as year of death. An additional problem is that for many people death 
date is unknown. I don't expect ITL to handle fuzzy intervals so I 
ignore this.

Everything went smoothly. There was not any delay noticable (see below). 
I started with a provided example but changed some things, such as using 
a const_iterator and const references, not yet using the documentation, 
and everything is standard and works as expected. The function I drafted 
is below.

b) The second project is related to planning and "leave requests". I've 
implemented this some years ago and remember that it got quite complex, 
with weekends, national holiday days and people working sometimes 40 
hours a week, but sometimes not on Friday afternoon etc. So I tried if 
Boost.Interval would have simplified my live there. And yes, it would 
have done for sure!

Starting with the manpower example I made adaptions, actually only 
simplifying it, plus subtracting days-of-per-week, until I had what I 
wanted. The whole routine is in one screen now and definitely logic and 
simple. The drafted functions are below.

c) I did some small other studies as well, especially on the map 
operators, and to draft an alternative "intro for dummies"... Of this, 
pieces are here in this review.

Let me repeat that I found this library VERY USEFUL and that I'm 
surprised that there are not many reviews yet.



2) DOCUMENTATION
The documentation is extensive and good. I've some small remarks about 
the introduction.

When I started I was confused by the definitions of interval_set and 
interval_map, and when to use them. It might be explained a little 
better for people not having worked with intervals before, probably with 
some sample code and images. The documentation contains, at the start:

"It means that we can use an interval_set or interval_map like a set or 
map of elements. It exposes the same functions."

and then gives the example " mySet.contains(42)". But contains is NOT an 
method of std::set, but YES from interval_set. This was confusing.

I then expected a sample for the map below but not there... Furthermore, 
that threelined set-sample means that the *interval* 42-42 is inserted, 
which was also not clear to me at first sight... I would change that 
section. I would suggest something like this:

void intro_sets_for_dummies()
{
    namespace itl = boost::itl;

    // STANDARD std::set works with values
    std::set std_set;
    std_set.insert(3);
    std_set.insert(7);
    BOOST_FOREACH(int const& value, std_set)
    {
        std::cout << " " << value;
    }
    //std::cout << " " << (std_set.contains(3) ? "YES" : "NO"); //not 
implemented in std::set
    std::cout << std::endl;

    // BOOST.ITL itl::interval_set works with intervals instead of values
    itl::interval_set itl_set; // Interval's of integer, e.g. 
seconds or years
    itl_set.insert(itl::interval(3,5));
    itl_set.insert(itl::interval(4,8)); // Overlap with previous is 
detected automatically
    itl_set.insert(1); // Interval of (1,1)
    BOOST_FOREACH(itl::interval const& interval, itl_set)
    {
        std::cout << " " << interval.first() << ".." << interval.last();
    }
    std::cout << " " << (itl_set.contains(3) ? "YES" : "NO") << std::endl;
    //  1..1 3..8 YES
}


The same for map. The itl::interval_map is REALLY very useful. My 
suggestion:
void intro_maps_for_dummies()
{
    namespace itl = boost::itl;

    // STANDARD std::map
    std::map std_map;
    std_map[3] = 4;
    std_map[7] = 8;
    for(std::map::const_iterator it = std_map.begin(); it != 
std_map.end(); ++it)
    {
        std::cout << " " << it->first << ": " << it->second;
    }
    std::cout << std::endl;

    // BOOST.ITL itl::interval_map
    itl::interval_map itl_map;
    itl_map += std::make_pair(itl::interval(3,5), 4); // add 4 to
[3,5]
    itl_map += std::make_pair(itl::interval(4,8), 8); // add 8 to 
[4,8] -> will generate 3 intervals
    itl_map -= std::make_pair(itl::interval(5,5), 2); // subtract 2 
-> will split another interval
    for(itl::interval_map::const_iterator it = 
itl_map.begin(); it != itl_map.end(); ++it)
    {
        itl::interval const& interval = it->first;
        std::cout << " " << interval.first() << ".." << interval.last() 
<< ": " << it->second;
    }
    std::cout << std::endl;
    // OUTPUT:  3..3: 4   4..4: 12   5..5: 10   6..8: 8
}

This would give me a clearer picture how it works. At the same time I 
now realize that during our discussion on +=, -=, |= , I didn't realize 
enough that += really does mathematics behind the screens. So it adds 
values where to the map on the specified interval, it subtracts values 
from the map on the specified interval. This goes beyond "set theoretic 
operations", "set union" and especially "set difference". See below for 
more on this.

I realize that a similar sample is there in the documentation, and that 
this principle, "aggregate on overlap" is well explained there. However, 
a sample with the numeric values is (for me) somewhat more clear than 
the given sample with names which are added to a set; the sample of 
interval_map with itl::set probably does too much at a time...

But let me repeat that the overall documentation is impressive and
complete!


3) PERFORMANCE
Phil claimed that:
> "consider trying to store the lifespans of people in a genealogy 
> program: recording a few thousand people living over a few hundred 
> years would approach the worse-case space complexity and need 
> megabytes of storage"
This was part of my motivation to do project 1 above, because these few 
thousand people over a few hundred years is exactly what I have. 
Boost.Itl calculates and reports the livespans in 0.031 seconds. So, 
measured in time, no problem at all. I compared it with another 
implementation which run in 0.032 seconds, so the same. I will not say 
that it cannot be enhanced somehow or whatever, I didn't study it in 
depth, I would only say: the library is well performing, also if the 
input set grows larger. It should not be rejected for the assumption 
that it could have been built otherwise.

Besides that, Boost.Itl does more than normal intervals: the 
interval_map associates values to intervals.


4) SOME NOTES
- the samples do not use it, but sets and maps are working with Boost.Range
- sets are also working with BOOST_FOREACH
- in the code samples, "using" is used in many places (or everywhere). 
While many Boost libraries do this, I would say that this is unclear for 
users visiting for the first time. What is that "date"? Is it a 
gregorian date, or an ITL date? So I would say something like
    namespace gr = boost::gregorian;
    namespace itl = boost::itl;

and use this like "gr::date" and "itl::interval<...>" everywhere, such 
that the origin of all those terms is immediately clear. As said, this 
is not criticism on Boost.ITL, it is a hint for enhancing Boost docs for 
first time users. I'm often struggling with the origin of terms 
especially if two libraries are used together. I realize we've done this 
ourselves in Geometry as well and I think changing it would make it more 
clear.


5) OPERATORS and FUNCTIONS
During the review period I had a discussion on the list with Joachim 
about the operators and the naming of functions. However, I thought that 
the += and |= were both doing a pure union, but it appears that they are 
doing a union PLUS a (sometimes mathematic) addition. So my current 
opinion is:

For sets (o.a. interval_set), the following operators are appropriate, 
as also used in Boost.dynamic_bitset:
&= for intersection
|= for union
^= for symmetric difference
-= for difference
So this is how it is implemented but I would remove += as a synonym for |=

For interval_map, it is currently much less clear to me now. The 
operators do two things at the same time, which is confusing me.

Let's see the following code pieces:
1|    itl::interval_map map1, map2;
2|    map1 += std::make_pair(itl::interval(3,5), 4);
3|    map2 += std::make_pair(itl::interval(4,6), 5);
4|    map1 |= map2;
This gives:  3..3: 4   4..5: 9   6..6: 5. So it is a set-theoretic 
union, PLUS mathematical addition. Aggregation on overlap. It IS very 
useful, but it does two things at the same time, and should be 
implemented and described carefully.

Changing the last line to intersection:
4|   map1 &= map2
gives  4..5: 9, so it is a set-theoretic intersection on intervals, PLUS 
mathematical addition. I still agree.

But now:
4|   map1 -= map2
gives  3..3: 4      4..5: -1  This is actually unexpected. It is not 
set-difference, because that would leave only 3..3 in the set and remove 
4..5 completely. So what it actually does is subtracting 5 from all 
there was before in map1, ONLY mathematical operation (minus), no 
set-theory here...
The documentation describes it like this: "Subtraction on Maps 
implements a map difference function similar to set difference", and it 
forwards to a link with more info about this ("While addition and 
subtraction on Sets are implemented as set union and set difference,"), 
I still don't agree, it is not set difference here.

To be complete:
4|   map1 ^= map2
gives  3..3: 4      6..6: 5. So yes, this IS the set-symmetric 
difference and (by nature of that) does or does not mathematical 
addtion, that is not important here.

Therefore I find the interval_map operators still confusing, even more 
confusing than before. Are they doing mathematics or set-theory? So I 
would suggest something like this:
4|    map1 |= something(map2, std::plus);
to make it more explicit. This is the set-theoretic union where on all 
overlaps the two sets are added, and
4|    map1 |= something(map2, std::minus); where the two sets are 
unioned, and the values of map2 are subtracted from map1. This is 
slightly different from the current -= implementation.
4|    map1 |= something(map2, std::multiplies); item, but values are 
multiplied

And then the same for &= (intersection), -= (set theoretic difference).
Then ^= probably need no policy because it always handled the intervals 
which are not common, an operator is not applicable here.
There are combine functors described (I saw later), but they are used 
different (I think) than this suggestion.


This would make it less confusing AND probably would give it more 
functionality, because the *= is currently not implemented (and might be 
reserved for scaling).

This was map + map. The same applies for the operator-= on map, 
interval, so:
1|    itl::interval_map map;
2|    map += std::make_pair(itl::interval(3,5), 4);
3|    map -= std::make_pair(itl::interval(4,6), 3);
result in  3..3: 4    4..5: 1

This could actually be stated clearer by using the std::map approach, on 
intervals, I tried it intuitively but it does not compile:
 map[itl::interval(3,5)]+= 4;
 map[itl::interval(4,6)]-= 3;
I would not have any problems with this, and I would expect still doing 
aggregate on overlap behind the scenes, and not set-theory, resulting in:
3..3: 4    4..5: 1    6..6: -3


6) STANDARD QUESTIONS
> - What is your evaluation of the design?
>   
The library is well designed. The comments above were only on one detail.

> - What is your evaluation of the implementation?
>   
I had only a glance to the implementation. It is clear, nice short 
functions everywhere, and well documented. It is build on top of the 
std::set and std::map, which is reported to be problematic, but I don't 
think it is. My test runs very fast.
> - What is your evaluation of the documentation?
>   
Impressive and quite complete. The introduction might add a section or 
tutorial for real-starters
> - What is your evaluation of the potential usefulness of the library?
>   
Very useful. I mentioned two projects where the library would have 
helped me if it had been into Boost earlier, and I'm considering 
rewriting them such that the problems make use of the library, and 
simplify them in that way.

> - Did you try to use the library?  With what compiler?  Did you have any
> problems?
>   

I used it using Visual C++ 2008 Express Edition and I didn't have any 
problems.

> - How much effort did you put into your evaluation? A glance? A quick
> reading? In-depth study?
>   
About 10 hours spread over three days.

> - Are you knowledgeable about the problem domain?
>   


Finally, thanks to Joachim for developing and submitting this excellent 
library, and to Hartmut for managing this review.


Regards,
Barend Gehrels,
Amsterdam





Ad a:

void CreateTimeline(std::ofstream& stream)
{
    namespace itl = boost::itl;
    typedef itl::interval_map map;
   
    map livetimes;

    for (TGenealogyVertexIterator vi = vertices(m_graph).first;
            vi != vertices(m_graph).second;
            ++vi)
    {
        CGenealogyVertexProperty p = GetProperty(*vi);
        if (p.indi.type == TYPE_PERSON && p.indi.birth.date.hasdate)
        {
            int death_date = p.indi.death.date.hasdate ? 
p.indi.death.date.year : 9999;
            livetimes += 
std::make_pair(itl::interval(p.indi.birth.date.year, death_date), 1);
        } 
    }

    stream << "LIVETIMES: " << std::endl;
    for (map::const_iterator it = livetimes.begin();
        it != livetimes.end(); ++it)
    {
        boost::itl::interval const& when = it->first;
        stream << when.first() << " - " << when.upper() << " : " << 
it->second << std::endl;
    }
}



Ad b:
void calculate_leave_request()
{
    namespace gr = boost::gregorian;
    namespace itl = boost::itl;

    itl::interval scope
                (
                    gr::from_simple_string("2010-04-01"),
                    gr::from_simple_string("2010-05-31")
                );

    itl::interval_set weekends_plus_festival_days(scope);
    weekends_plus_festival_days -= weekends(scope);
    weekends_plus_festival_days -= gr::from_simple_string("2010-04-05"); 
// Eastern
    weekends_plus_festival_days -= gr::from_simple_string("2010-04-30"); 
// Queensday
    weekends_plus_festival_days -= gr::from_simple_string("2010-05-13"); 
// Ascension day
    weekends_plus_festival_days -= gr::from_simple_string("2010-05-24"); 
// Whitsun

    itl::interval_map request_in_hours;

    // Add 8 hour per day
    request_in_hours += std::make_pair(scope, 8.0);
    request_in_hours &= weekends_plus_festival_days;

    // Subtract 4 hour for all Fridays
    {
        gr::date date = gr::next_weekday(scope.first(), 
gr::greg_weekday(gr::Friday));
        gr::week_iterator it(date);
        while(it <= scope.last())
        {
            itl::interval scope(*it, *it);
            request_in_hours -= std::make_pair(scope, 4.0);
            ++it;
        }
    }

    report("Request", scope, request_in_hours);
}

// where report is:
template 
double report(std::string const& caption, Scope const& scope, Map const& 
map)
{
    namespace itl = boost::itl;
    namespace gr = boost::gregorian;

    std::cout << caption << ": " << scope.first() << " - " << 
scope.last() << std::endl;

    double hours = 0;
    for(typename boost::range_iterator::type it = 
boost::begin(map);
        it != boost::end(map); ++it)
    {
        itl::interval const& interval = it->first;
        // Add 1, because Boost.Date counts [2010-05-01 ... 2010-05-02 
as one day]
        int days = 1 + gr::date_period(interval.first(), 
interval.last()).length().days();
        std::cout << interval.first() << " - " << interval.last()
             << " -> " << days << " * " << it->second  << std::endl;

        hours += days * it->second;
    }
    std::cout << "Total hours: "  << hours << std::endl;
    return hours;
}

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
 
CD: 3ms