Thursday, February 23, 2006

Open source technical software - dead end?

The open source movement has had a large impact on the software industry in recent years. I for one am a huge fan of Firefox and Thunderbird from the Mozilla project and a modestly satisfied user of OpenOffice. Can scientific software expect the same kind of open source benefits?

First, it has to be said that at the small-scale end, scientific software is, and always has been, dominated by open source code. When small specialist problems are solved in software, there is often no commercial opportunity, so developers put the code into the public domain rather than do nothing with it.

But the reality is very different with large software products. Complexity of software development rises disproportionately with the size of source code. This requires teams, which in turn require coordination and more people. It rapidly becomes expensive. And it becomes too involved to have developers spend just small amounts of their time on it.

When you look around at large scale open source projects, they are nearly all commercial software that has been released: Firefox was once Netscape, and OpenOffice was once StarOffice.

On this basis one can probably assume that any future major scientific software open source project already exists. It is either one of the existing open source projects (Axiom, Maxima, Reduce, Scilab) or it is one of the commercial products (Matlab, Mathematica, Maple, MuPAD etc), should its owners fail to make it pay or fail as companies. Since all these products are the central revenue source of their companies, we can rule out the code being donated, as IBM did with Eclipse. With the exception of MuPAD, their companies look pretty healthy too.

So let us turn our attention to the three main large open source projects: Axiom, Maxima and Scilab. Do any of them have the potential to become the killer free science app? Well, no, and let me argue why:

1) The knowledge required to be a contributor is a level above that required to be a contributor to, say, Firefox. You need to be both a competent programmer AND a competent mathematician to be able to code and debug complicated algorithms. The world may be full of keen students who want the kudos of getting a few lines of code into Firefox, but there are many fewer who are actually good enough programmers, and fewer still who might know the math to add to, say, Axiom. This means that these projects will never be able to keep up with the development rate of the commercial packages. They are behind already, and will only fall further back.

2) Large technical computing systems have a lot of internal dependencies. You may be able to take a task like "change the bookmark mechanism" in Firefox and be pretty confident that the team won't affect the way your browser renders pages. Ask someone to add features to, say, linear algebra, and you might affect equation solving, statistics, ODE solvers and many more areas that each affect other components. This again raises the technical requirement of your contributors, reducing the pool further, AND adds a significant management and system design overhead.

3) To support the costs, free software needs major financial contributions. OpenOffice gets the backing of Sun, who would like to sink Microsoft Office; Eclipse is backed by lots of big companies like IBM (who want it as a platform for their commercial tools). Only Scilab has any backers. The Axiom group recently commented that they had their first funded project: a $4500 grant from the Google "Summer of Code" one-time charity. And that 2 month project does not appear to have been delivered after 5 months!

4) The major selling point of free software is "it's free". But of course there are other costs to installing new software, principally the training cost. It takes next to no time to learn how to use a word processor, so OpenOffice is still almost free. But the learning time for scientific software can be major. So if it takes a month to learn, then Axiom is now a $2000 product. If you add the same cost to, for example, Mathematica, it is a $4000 product, and arguably, with the better support that comes with a commercial system (a large user base, lots of materials, books, training, etc.), perhaps it is only $3000. Free is still cheaper, but the relative difference is much smaller than it is with simple consumer products.

5) Every successful open source project has established itself as THE alternative. Firefox is THE choice of browser after IE (who really has heard of Opera?), OpenOffice is THE office suite, etc. This isn't true yet in science software. Two must die to give the third any recognition, and if MuPAD were to join the group, the problem would get worse.
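
The arithmetic in point 4 can be sketched as follows. This is only an illustration: the $2000 license price and the $1000 reduced training cost are assumptions back-derived from the $4000 and $3000 totals quoted in that point, and the function and variable names are hypothetical.

```python
def total_cost(license_price, training_cost):
    """Total cost of adopting a package: license plus time spent learning it."""
    return license_price + training_cost


# Figures in US dollars. The $2000 training cost (one month of learning)
# is from point 4; the license price and the reduced training cost are
# assumptions implied by the totals quoted there.
axiom = total_cost(license_price=0, training_cost=2000)
mathematica = total_cost(license_price=2000, training_cost=2000)
mathematica_supported = total_cost(license_price=2000, training_cost=1000)

print("Axiom:", axiom)
print("Mathematica:", mathematica)
print("Mathematica with better support:", mathematica_supported)
print("Cost ratio:", mathematica_supported / axiom)
```

Under these assumed figures the commercial package costs one and a half times as much as the free one, a much narrower gap than the sticker prices alone suggest.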

5 comments:

Edwood Ocasio said...

I use R, Scilab and gretl in my teaching at Univ. Puerto Rico - Cayey Campus and we find them more than adequate.

Those three FLOSS projects I mentioned look strong. However, Maxima and Axiom, I do fear for them.

We need more blogs about scientific computing. There are groups, mailing lists, but blogs are difficult to find.

Keep blogging!

Scientific Computing said...

Thanks for the comments.

R and gretl have, I think, the advantage that statistics has a larger base of users. But equally, they face larger competition.

For me Scilab is the most interesting to watch at the moment; it has some worthwhile momentum behind it.

Anonymous said...

Let me try to address your comments point by point, at least with respect to Axiom.

1) scientific software is dominated by open source code.

Well, this IS science, after all. We are focusing on the long term. Scientific software differs from, say, Firefox, because the answers will be the same 30 years from now. Thus as we continue to build and contribute more software the amount of useful mathematics grows.

2) Complexity of software development rises disproportionately with the size of source code.

Well, yes and no. Axiom tries to remain close to the standard mathematical frameworks. If you understand the mathematics then it is much easier to understand Axiom. Because we try to maintain this correspondence and we factor the code based on this model it is possible to write short functions that work in many contexts. Thus there is a near-linear rise of complexity in Axiom.

The second issue of complexity has to do with understanding. We have rewritten Axiom into a literate programming form and are actively working to merge the research papers with the code. The goal is to allow someone to sit and read both the theory and the code. We believe that this is critical to the long term viability of scientific software.

2a) And it becomes too involved to have developers spend just a small amount of their time on it.

This is a problem? :-) I'd spend all of my time on it if I could. It's a whole new field of computational mathematics. Centuries ago mathematics was "just a hobby" and some people spent their lifetime solving a single problem. Mathematics has always been a poorly funded activity and computational mathematics doubly so. But you're living in the unique moment in history where computers and math have collided.

Actually I do get bug reports and fixes from people who have found and fixed a single problem.

3) When you look around at large scale open source projects they are nearly all Commercial software that has been released... [snip]...On that basis one can probably assume that any future major scientific software open source project already exists.

I agree. Axiom has 36 years and hundreds of man-years of research and development work. It was sold as a commercial competitor to Mathematica and Maple. It was developed at IBM Research. I do not believe that it is possible to invest that much time and effort again to solve the same problems.

On the other hand consider the idea that all companies eventually go out of business. What happens to computational mathematics if Mathematica or Maple fail? Where does all the science go? Macsyma died with the company that owned it. Bill Schelter got an old copy from DOE and rescued some of Macsyma as Maxima but the rest was lost.

Mathematica and Maple represent a large amount of intellectual capital. If either company fails will the owners simply give away that value? If the answer is no then we end up with a huge black hole in computational mathematics.

If the answer is yes then who will maintain it? I have never seen the source code for either system, but my commercial experience is that software is VERY poorly documented and I suspect that MMA and Maple are no exception. If I hand you a million lines of C code with no comments can you maintain it?

I was one of the original developers of Axiom. I got my own code back after 15 years. It contained few comments. It took about 2 years to fully recover and make available all of the functionality.

So if MMA and Maple die, and in the unlikely event they give away the software, who will maintain it? And if these programs are no longer available what happens to the computational mathematics written for those systems? Consider the programs written in Macsyma and you'll see the likely future result.

4) The knowledge required to be a contributor is a level above that required to be a contributor to, say, Firefox. You need to be both a competent programmer AND a competent mathematician.

You've ignored the fact that some of the PhD work has been done using Axiom and that this will likely continue.

You've also ignored the fact that computational mathematicians LIKE to do science even as a hobby.

There are, by my estimate, about 500 researchers worldwide who contribute to this field. Some of them see the value of having their work openly available in Axiom. We only need one person to code an algorithm and the world benefits.

So the small number of people in this field limits the number of contributions but it only takes one.

Also, you're ignoring the fact that Axiom is also a very large common lisp program. The non-mathematical portions can be worked on by any programmer. Even non-programmers can contribute. I've had a discussion this week with someone who wants to translate my Axiom Tutorial book. That contribution does not require either programming or mathematics.

There is a lot of work to do (e.g. GUIs, documentation, porting, optimization, doyen (sourceforget.net/projects/doyencd)) which is non-mathematical and to which anyone can contribute.


4a) This means that these projects will never be able to keep up with the development rate of the commercial packages. They are behind already, and will only fall further back.

You've assumed a linear path here. Axiom is not trying to follow MMA and Maple. For example, Axiom can be used online for free (wiki.axiom-developer.org).

Even assuming a linear path, the commercial systems depend on external developers (the ones you claim are so few) in order to develop most new algorithms. Suppose those people turned to Axiom instead? Suddenly the commercial systems would be "behind".

5) Large technical computing systems have a lot of internal dependencies

Well, yes, but Axiom is designed to make it possible to write very few lines of code and have it work everywhere. You can add new functions and have them "just work" in domains you didn't write. This is not true for the commercial systems and it is a fundamental design issue so it isn't something that will change.

6) To support the costs, free software needs major financial contribution.

I, personally, am the only source of funding for Axiom. It costs money to support our server, pay for the books (I give away free copies), pay for the ISBN, pay to give "free talks", pay for the travel, etc. But every hobby costs money :-)

I talked to the NSF about grant funding and they said they will not support any activity that has a commercial counterpart. So as long as MMA and Maple exist I cannot get grant funding.

Our only real need for money is to support PhD students and researchers in various areas of computational mathematics. That is, unfortunately, beyond my personal means to do.

6a) ... And that 2 month project does not appear to have been delivered after 5 months.

It's open source. Make it happen :-)

The work is there and is having its impact. You're assuming some sort of a schedule. Or you're assuming that 2 months' worth of work generates a high-quality, polished result. These things take time and, like good cooking, will appear when they are ready.

Google had a great idea and I hope they continue doing it. In fact, I wish they would expand it to longer terms so we could work with a student and their advisor (a professor?) on a year-long project. Such continuous-basis grant funding would allow us to make great progress.

6) The major selling point of free software is "it's free".

Nope. Not in the case of Axiom. The major selling point is that it does what you want done and does it correctly. Pay many thousands of dollars for commercial software, find a bug, and wait until next year for a fix. And then pay for the new version with the fix. Pay nothing for Axiom, find a bug, and either fix it yourself or get a fix from CVS. We "release" Axiom about every 2 months and it costs nothing for the "new" version.

As for documentation, we have just released the first Axiom book (search Google for "Daly Axiom"), which is an Axiom Tutorial. There are more books currently being written. In fact, the books will contain the actual source code of the system so you can see both the theory and the implementation.

7) Every successful open source project has established itself as THE alternative.

Really? How do you define success? Axiom is downloaded thousands of times. People who are not subscribed to the mailing list can quote my own words to me at conferences. People are buying the book. Conferences are being held.

And you've ignored the timeframe. OpenOffice is many years in the making. Firefox is an offshoot of Mozilla which grew from Netscape. Years. Many years.

Axiom's theme is "The 30 Year Horizon". We're in it for the long haul and, unlike Firefox, computational mathematics just builds on itself.

I define Axiom's success as being a high-quality piece of software that gives correct answers. We're successful if we convince computational mathematicians to combine their research and implementation as literate programs. We're successful if we contribute to the long term future of the science.

You see science software as a competition and I see it as cooperation. We're in it for the science.

Tim Daly
Axiom Lead Developer

Dave said...

"This means that these projects will never be able to keep up with the development rate of the commercial packages."

Of course, but it doesn't matter. It doesn't matter how long it takes an open-source project to get to a functional stage, it just matters that it gets there, that it is good enough. That is the key to Free Open Source Software (FOSS).

If you look closely enough, it seems fairly obvious that software follows a standard trajectory: awful free code that is a really good idea, a rapidly-improving commercialized version, and then solid open-source code. Once the FOSS version gets to the "good enough" stage then the commercial versions start doing stupid GUI revamps and feature-bloats, trying to show they're still somehow better (aka Office 2007). However, once there is a free good-enough version, it doesn't really matter. At that point, rapid commercial development doesn't help, it just adds more and more useless features that nobody wants, or unnecessary changes simply to justify charging for the next version. Once code hits the "good enough" stage, FOSS has all the advantages.

Have some patience; FOSS alternatives will eventually get to whatever niche you operate in. It is inevitable; it just takes time. If you're the impatient type, then contribute.

Personally, I'm trying to wean a particular instructor off of Minitab. My search brought me here, for some reason. Oh well, no solution yet, but there will be. Sooner or later R will get a Minitab-like front-end, or something else will come along. Until then, well, patience is a virtue.
