Hello ProjectFortress! Questions and discussion on scripting, pattern matching, goal-directed programming, and other nondeterministic idioms.
Newsgroups: gmane.comp.lang.fortress.general
Date: 2008-06-29 13:01:06 GMT
Hello Project Fortress!
I've been lurking around the project for a while now. As someone who's done a lot of scientific programming, I was routinely dismayed by the linguistic support for what I wanted to do, and set to work learning language design, sometime around last summer. I made a list of things I wanted to try, things I thought I'd really want when I got back to scientific programming. Happily, once I discovered Fortress, I found it had most of them, and many more good ideas that I hadn't expected -- especially in parallelism. And so I am very excited.
I'm especially glad that you're using mathematical notation. There just weren't enough operators in other languages for everyone to do what they wanted. But the improvements in readability are enormous. And, we can't forget Kernighan's law:
"Debugging is twice as hard as writing the program, so if you write the program as cleverly as you can, by definition, you won't be clever enough to debug it."
That's a big problem people encounter in scientific code. It's not impossible to write, in current languages, but it's really, really hard to read, because all of the management of the mathematical objects (say, vectors and matrices and big numbers), all of the parallelism and numerical checks, needs to be handled inside the main code. And it mucks everything up. So improving notation, and parallelism, and language extensibility, and providing all sorts of powerful primitives and the ability to carry around properties, all this excites me.
I've been kicking around a bunch of questions for a while now. And I wanted to put in some more study before I asked them. But I read the exhortation on the Fortress Bootcamp today, to not "hold back your impressions!" So I'll bite my lip and make a fool of myself. Here are some of mine:
There are what seem like competing goals you're working under. HPC seems to demand things like unboxed variables, and static type checking, and rock solid exception handling, and a great security model.
On the other hand, I see great potential in Fortress, or a Fortress-derived language, as a *scripting* language for scientists and mathematicians. I can imagine something like it replacing Mathematica as the default first language people learn, and the default language people work and think in. This is partly because the parser and character set are so powerful, and the language is so extensible, that the natural expression of many statements will, with the right libraries, take little effort to translate into Fortress. Also, with parallelism and easily parallelized constructs built so deeply into the language, I can imagine the computations that slow Maple and Mathematica down just blazing on Fortress, and that would be really great.
But for it to be a nice scripting language, the syntax seems a little heavy. At least at first, that's my impression. I look at the conjugate gradient example and think 'my, that's beautiful, it's like the canonical encoding of what that algorithm *means*'. But then I look at even Hello World and I'm a little daunted, because it still looks like there's an awful lot of code around that isn't core to the meaning. What's all this 'export' and 'executable' and run =, and component stuff? The canonical representation of hello world is 'print "hello world"'!
One of the core philosophies of the scripting world is that you can get up and running in an extremely short time. And then it's a really, really simple matter to move next to simple calculations, and variable assignment, and loops, and so on. Access to a powerful and helpful interactive interpreter/debugger, like Python, or Mathematica, is really helpful too (if you go one step further, and typeset code and equations, it will be even better).
These features make a smooth gradient for absolute beginners to climb. Eventually, they'll get all the way up to being able to build library code. I think that smoothing that gradient out could really be crucial for Fortress. A goal should be that it's a language people will want to teach students *first*. It is so different from previous languages that I see only a few of the old guard, those really in the know, switching -- the rest will have to pick it up early on. But that's okay, you have long term goals.
So one of my impressions, which I'll just put out here, is that having a 'script mode' might be a very handy thing: for code outside of a 'component' scope, by default, assume that the component is named after its file and, by default, exports (or even just plain runs) an executable. This simplifies things for beginners and everyone else -- you can just name components by the filename, you don't need to write the name twice, which was one of the things that annoyed me about Java, even before I started scripting. There's a name for this philosophy used in the Python world: don't repeat yourself.
For example, code blocks are delimited with 'end'. Are the delimiters needed? Python does without them, and I think that pays major dividends in readability. And it takes less work to type. Programmers by default seem to follow the path of least typing. Is there anything in the parser, or in the semantics, that needs them? And I imagine that the default method for coding Fortress would eventually include a realtime typesetting IDE, so in fact readability would be even easier than in Python -- you could control the block spacing to guide the eye.
Finally, I would like to make the case for some exploration of declarative, nondeterministic programming. Fortress indulges in both, in some sense, due to implicit parallelism. I don't really have anything settled in my mind yet, but there are directions that are interesting to me. I have been playing with the pattern matching mechanisms of Erlang and Haskell (of which multiple assignment, like Python and Ruby have, is a subset). Erlang is not a fully declarative language like Prolog -- they used only the stuff they could make fast. But what I've found is that pattern matching is extremely useful as a 'do everything' control flow abstraction. I'm afraid it's too late tonight for me to give examples, but if you ask me I can write some later. For now, you should know that Erlang uses pattern matching for variable assignment. Erlang uses it for sanity checks. Erlang uses it for string manipulation. Erlang uses it to simplify calling and writing anonymous functions. Erlang uses it to implement typing. Erlang uses it to implement array/list/data structure slicing. Erlang uses it to implement object orientation. Erlang uses it to streamline exception handling. Erlang actually has an if statement, but nobody uses it because case statements, with the limited pattern matching that Erlang gives, are so incredibly useful.
In principle, pattern matching is the sort of thing I could add on later. But some of the biggest wins come from applying it in these nooks and crannies, and they're really everywhere. It could be a very valuable thing to have right at the core of the language. So I'd like to see, at this early stage, if it's a feasible thing to look into. Erlang's pattern matching is one of the most powerful, time-saving abstractions I've come across, and they're not even taking it as far as they could.
The other thing is that pattern matching is an abstraction that mathematicians are really used to. People have an innate sense of the 'shape' or 'pattern' of some expression, and can easily see you taking one thing, and splitting it up, and putting it in some other thing. It's very easy to grasp, conceptually. That's almost what mathematical calculation is: every time you see a derivation with an = or an arrow, what you're really seeing is some pattern matching and symbol manipulation, of which variable assignment is a tiny subcase. Even consider the way people write case statements, or multi-valued functions; here, for example: http://mathworld.wolfram.com/HeavisideStepFunction.html
Exception handling is another thing I'm interested in. Unfortunately, I think the standard way of handling exceptions causes an awful lot of clutter.
Here are some examples.
Very, very often, you know that something could cause an exception, but you want that exception to pass silently, because it's either not important to the program, or it's part of a chain of actions for which you know what to do if any of them fail.
For example (in Ruby, taken from http://code.causes.com/blog/drying-out-deep-checks):
We found ourselves very often writing code with conditions that looked like:
if object && object.child && object.child.valid?
  do_something(object.child)
end
In this case all we really want to do is verify that object.child.valid? returns true, but writing
if object.child.valid?
left us vulnerable to the dreaded
They then implemented the 'try' method, which just maps an uncaught exception -> nil (false), and on anything else it just yields control.
if try { object.child.valid? }
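For comparison, a rough Python analogue of that helper. This is a sketch, not the blog's implementation: attempt, Parent, and Child are hypothetical names made up for the illustration.

```python
def attempt(thunk, default=None):
    """Call thunk(); if it raises any Exception, return default instead."""
    try:
        return thunk()
    except Exception:
        return default


class Child:
    def __init__(self, valid):
        self.valid = valid


class Parent:
    def __init__(self, child=None):
        self.child = child


good = Parent(Child(valid=True))
missing = Parent()  # child is None, so reading .valid raises AttributeError

if attempt(lambda: good.child.valid):
    outcome = "did something"
else:
    outcome = "skipped"
```

The whole chain of checks collapses into one expression, at the cost of also swallowing exceptions that were not anticipated.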
This is a common idiom. Here is another one, taken from the Wikipedia page on Icon: http://en.wikipedia.org/wiki/Icon_(programming_language)
For instance, we can write a program to copy an entire input file to output in a single line:
while write(read())
When the read() command fails, at the end of file for instance, the failure will be passed up the chain and write() will fail as well. The while, being a control structure, stops on failure, meaning it stops when the file is empty. For comparison, consider a similar example written in Java-based pseudocode:
try {
    while ((a = read()) != EOF) {
        write(a);
    }
} catch (Exception e) {
    // do nothing, exit the loop
}
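Python can get fairly close to the Icon one-liner with the two-argument form of iter, which keeps calling a function until it returns a sentinel value. A small sketch, with in-memory streams standing in for real files and copy_lines as a hypothetical name:

```python
import io


def copy_lines(source, sink):
    """Copy source to sink line by line.

    readline() returns "" at end of input; iter() treats that sentinel as
    the point where the loop should stop, much as Icon treats failure.
    """
    for line in iter(source.readline, ""):
        sink.write(line)


src = io.StringIO("alpha\nbeta\n")
dst = io.StringIO()
copy_lines(src, dst)
```

Here the end-of-input condition is an ordinary value rather than an exception, so no try block is needed.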
It's a simple notion - things eventually fail, so just tell the program what to do when it does. Here's another example.
Icon includes several generator-builders. The alternator syntax allows a series of items to be generated in sequence until one fails: 1 | "hello" | x < 5 can generate "1", "hello", and "5" if x is less than 5. Alternators can be read as "or" in many cases, for instance:
if y < (x | 5) then write("y=", y)
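A loose Python approximation of that alternation, assuming we only care whether some alternative makes the comparison succeed. The helper names alternatives and succeeds are hypothetical; Icon itself needs no such helpers.

```python
def alternatives(*values):
    """Yield each value in turn, roughly like Icon's a | b | c."""
    yield from values


def succeeds(predicate, generator):
    """True as soon as one generated value satisfies the predicate."""
    return any(predicate(v) for v in generator)


x, y = 3, 4
if succeeds(lambda v: y < v, alternatives(x, 5)):
    message = f"y={y}"   # reached because 4 < 5, even though 4 < 3 fails
else:
    message = None
```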
You can see how this could have a wonderful synergy with Fortress. Goal directed execution allows you to multicast control flow. Often in combinatorial code, for example, you need to find some solution, but you don't really care which one you get. You could write
if test(possible_solution_generator) then return solution
and be done with it.
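In Python the same "give me any solution" idea can be sketched with a generator and next. The example below is an illustration only: first_solution and is_valid_queens are hypothetical names, and the N-queens puzzle stands in for whatever combinatorial problem is at hand.

```python
from itertools import permutations


def first_solution(candidates, test):
    """Return the first candidate that passes test, or None if none does."""
    return next((c for c in candidates if test(c)), None)


def is_valid_queens(cols):
    """cols[r] is the column of the queen in row r.

    Permutations already guarantee distinct rows and columns, so only the
    diagonals need checking.
    """
    n = len(cols)
    return all(abs(cols[i] - cols[j]) != j - i
               for i in range(n) for j in range(i + 1, n))


# Candidates are generated lazily; the search stops at the first success.
solution = first_solution(permutations(range(6)), is_valid_queens)
```

The caller never says how to iterate or when to stop; it only states the test, and failure to find anything shows up as None rather than as an exception.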
Also, in exception handling, there may be cases where an elaborate exception hierarchy is really valuable, but in my (limited) experience, just keeping things flat and using message passing is more flexible, and easier. One of the problems is that people, in the Java world, use the exception hierarchy as a method to handle control flow -- some things are unexpected errors, but others just prompt a response. The trouble seems to be that this demands that you define, ahead of time, a class hierarchy of exceptions, and then carefully plan how each function will throw or handle different types. It's a big, front-loaded design task, and something that people tend to get wrong on a first pass. Usually you can't foresee errors. I don't know if it is like this in Fortress, but in Java, the idiom is to make a new exception class, and then throw it, declaring on every possible calling function how this exception is supposed to be handled. This is a major task. And people pollute the hierarchy with exceptions that are simply 'events' that you need to handle. Frankly, I think this is better solved with co-routines.
-----
Wow. This has gotten way longer than I wanted it to be. But it's 6am now, and I'd better get some rest. Hopefully, it's not so long that nobody responds, but, if so, I guess it's worthy of refinement.
In any case, I wish everyone on the project all the best.
Danielle Fong