These are the original answers to an interview published on macnews.de, which the interviewer translated into German.

 

macnews.de: Charles, how did the idea evolve to use Xgrid for your calculations?

There were three main reasons for choosing Xgrid. First, the program used for the calculations was written on a Mac, for a Mac. Second, Xgrid is extremely easy to set up and use, even for the 'agents' who want to donate their idle CPU. Third, my calculations are of the 'embarrassingly parallel' type, which means they can be divided into many tasks that are completely independent of each other. This is the easiest kind of parallelization, so Xgrid is perfect for it.
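
As a minimal sketch of what 'embarrassingly parallel' means in practice: each task below depends only on its own input, so the tasks can run in any order, on any worker, and still give the same results as a serial run. The function `run_task` is a hypothetical stand-in for one real calculation, and a local thread pool stands in for the Xgrid agents; none of this is the actual plug-in code.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task(seed):
    """Hypothetical stand-in for one independent calculation."""
    # The result depends only on `seed`, never on another task's output.
    return sum((seed * k) % 7 for k in range(1000))


def run_all(seeds, workers=4):
    """Run every task independently and return results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_task, seeds))


if __name__ == "__main__":
    seeds = list(range(8))
    parallel = run_all(seeds)
    serial = [run_task(s) for s in seeds]
    # Because the tasks share nothing, both runs agree.
    print(parallel == serial)
```

Since no task waits on another, adding workers only changes how long the whole batch takes, not what it computes.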

macnews.de: Can you sum up the history for us? You started with 8 computers in March 2004, right?

I first heard about Xgrid in January 2004 (when the 'Technology Preview 2' was released), when I was already facing some limitations in terms of computation power. I was still working on improving and adding important features to my simulation/fitting app, so it was another 2 months before I gave Xgrid a first shot.

It turned out to be extremely easy to set up and use. It took about a week to get a first functional plug-in (this is the piece of code that talks to Xgrid) that could get my program to run on several machines. I first ran the computations on the few Macs that we had in the lab (these are the 8 computers of March 2004). It was running smoothly, so I enrolled a few more Mac users in the lab where my wife works (we promised them some home-made French cakes). A few other people at Stanford volunteered as well, and that got me to about 40 computers in May 2004. I had built a little web site to explain the installation process, so it was time to add it to my signature. That got me a few more agents. Some of these new agents posted threads on ArsTechnica.com and on MacGeneration.com, which brought new waves of enrolment. Latest addition: there was an article on MacNewsWorld.com last week and a post on MacSlash.org this weekend. So the cluster has been growing like crazy these last few days. It might hit 200 GHz on Monday...

The most common agent is a male, computer-savvy Mac owner who wants to give Xgrid a try. Enrolling in an existing cluster like Xgrid@Stanford is a great way to see Xgrid in action (if you have no use for Xgrid yourself, it can otherwise be a little frustrating, and the CPU meter is not going to move to exciting numbers). There are also people who are really interested in distributed computing and who have participated, or are participating, in other projects. And the third kind of agent is of course those who have been promised home-made French cakes.

macnews.de: How much power do you have at the moment? Where are the most people based?

Right now, on Saturday Aug 14 at ~11 pm Pacific time, about 190 GHz, coming from ~130 computers not being used by their owners. About half of them are based in the US, and about another half in Europe. A few computers are based in Australia. I have a couple in Asia, and one in South America (Caracas!). At some point, I want to have more precise statistics about that and come up with a nice map, because this is something the 'agents' ask me about a lot. A first version of the map actually exists if you look at the last slide of the movie on the 'Project' page of the web site. For your German readers, I looked more specifically and there are at least 15 computers there... I am French myself and quite happy with the 30+ French agents.

macnews.de: What exactly do you calculate?

I will try to keep this brief and understandable. For the gory details, I recommend the 'Project' section of the web site.

First, some background on the biology. The lab I am working in is run by Brian Kobilka. The general goal of the lab is to understand the functioning of a large family of ~700 different proteins, called the G-protein-coupled receptors. Each of these receptors recognizes a different molecule (like a neurotransmitter or a hormone), and is involved in a different function (like heart regulation, bone regulation, or the sense of sight or smell). One aim of the lab is to determine the number of steps and the kinetics of the activation process: what happens to the receptor when it is activated by a drug? This is a very important goal for the pharmaceutical industry and human health agencies, as the G-protein-coupled receptors represent half of the current targets for medications already prescribed or in development.

Now, about the calculations. The lab has generated some biophysical data, obtained using fluorescent probes attached to a receptor. We want to fit the data with a biochemical model and answer some of the questions explained above. We don't know which biochemical model will work best, so we are testing several of them. For each model, we actually run many different fits. This is what takes a lot of computer time. The reason to run many different fits is to make sure we do not miss some parameter values that would fit. If a model does not fit after trying this hard, we can dismiss the model with high certainty, and this makes the 'right' model look even stronger. So, in fact, dismissing models is an important part of the process.
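
The idea of running many fits per model can be sketched roughly as follows. Everything here is an assumption made for illustration: the data are synthetic and noise-free, the two candidate models (`exponential_model`, `linear_model`) are toy functions rather than the biochemical models of the project, and the optimizer is a deliberately simple coordinate search rather than whatever the real fitting program uses. The point is only the structure: many independent starting points per model, keep the best fit, and compare models by their best achievable error.

```python
import math
import random

# Synthetic, noise-free "data" from a known model (illustration only;
# these are not the lab's measurements).
TIMES = list(range(10))
DATA = [2.0 * math.exp(-0.5 * t) for t in TIMES]


def exponential_model(params, t):
    a, k = params
    return a * math.exp(-k * t)


def linear_model(params, t):
    a, k = params
    return a + k * t


def sse(model, params):
    """Sum of squared errors between the model and the data."""
    return sum((model(params, t) - y) ** 2 for t, y in zip(TIMES, DATA))


def local_fit(model, start, step=0.5, min_step=1e-6, max_iters=10000):
    """Simple coordinate search from a single starting point."""
    params = list(start)
    best = sse(model, params)
    for _ in range(max_iters):
        if step < min_step:
            break
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                err = sse(model, trial)
                if err < best:
                    params, best, improved = trial, err, True
        if not improved:
            step /= 2.0  # no neighbour is better: refine the step size
    return best, params


def multi_start_fit(model, n_starts=20, seed=0):
    """Run many independent fits and keep the one with the lowest error."""
    rng = random.Random(seed)
    starts = [(rng.uniform(0.0, 5.0), rng.uniform(0.0, 5.0))
              for _ in range(n_starts)]
    return min(local_fit(model, s) for s in starts)


if __name__ == "__main__":
    for model in (exponential_model, linear_model):
        best, params = multi_start_fit(model)
        print(model.__name__, "best SSE:", best, "params:", params)
```

Each call to `local_fit` is independent of the others, so the starting points could just as well be handed out as separate jobs. A model whose best error stays large after many starts is the kind of model that can be dismissed.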

Such biochemical models have important implications for drug design and drug development. I am not going to pretend that we will wake up one day to find our Macs have found the cure for stroke or cancer. However, it is generally accepted in the field that the models currently used are simplistic, and that more complex models would be very valuable for new developments in pharmacology and drug design.

macnews.de: Do you already feel a little bit like Seti@Home? Will we see similar things using Xgrid?

I am not very familiar with Seti@Home; I know Folding@Home better, which is also based at Stanford and also focused on protein structure. I do not feel too much like Folding@Home. It is very different in many respects. The major difference is the size: dozens of computers, versus hundreds of thousands of computers. I just have one server. I have a short-term goal (probably less than a year). All in all, my project fits well with the cluster size, just as Folding@Home's ambitions are very consistent with their huge computational power.

In fact, my project was not initially designed with distributed computing in mind. Actually, even after starting the cluster with the Macs in the lab, I did not really envision anything like that. After being invited by Apple to give a talk at the WWDC, I finally felt that maybe this Xgrid project could interest other people in the Mac community.

I really hope that Xgrid will develop into a mature product that more people will use for projects of all sizes. There is great potential here, and Apple will probably want to make Xgrid as versatile as possible. They might also decide to first make it really good for small and medium-size clusters, like for a company or a university that wants to make the best of unused resources. For good examples of that, look at James Reynolds' web site [http://www.macos.utah.edu:16080/Documentation/xgrid/povray.html] or the Wolfgrid project [http://packmug.ncsu.edu:16080/wolfgrid/]. But projects similar to mine might become more common soon, and you might even see web sites listing all such Xgrid projects, with comments from the enrolled agents. So every potential CPU donor could choose a favorite project, and there would be some competition between the different projects (I have read about that idea several times already). With an Xgrid pref pane as part of the system, and just one address to change in one field of that pref pane, it would really make such things very easy... Even with just 5% of the market, that still makes a lot of Macs.

macnews.de: Xgrid is in its second development preview at the moment. Is it pretty stable already? What do you miss, feature-wise?

It was very stable for me until I reached ~50-100 machines. Above that, I have had more server crashes, and the stability seems to depend directly on the amount of data transferred when submitting a job (that includes the executable and the actual data) and receiving the results. I am in the process of reducing that data transfer to almost zero and I really think it will make a difference (I ran some dummy jobs once for testing). This instability was a little disappointing of course, but I also realize that data transfer is bound to become an issue at some point, when everything goes through just one machine. So I will rely on a separate web server for the data distribution.

On the agent side, there do not seem to be any serious issues. As expected, the agents are completely insensitive to the server crashes. The most annoying thing is a bug in the screensaver that sometimes shows no activity when the cluster is actually running at full speed... For the people who want to see the cluster speed, it can be frustrating. This is why I put a live gauge showing the cluster speed on my web site.

I miss a few features, of course. It would be nice to have a more general API, not just a plug-in architecture. More documentation would be nice too. I would like some built-in stats. Just logging to a text file all the server events would be useful. In fact, I don't really ask for much more. I am quite happy with the current Xgrid for my needs. A few minor bugs need to be fixed, and even better stability would be nice.

macnews.de: How does Xgrid work on the Internet? Is it secure, i.e. the traffic encrypted? Do people have to stop working to let it run?

I am not a security expert, so take my answer with caution... I believe the traffic is encrypted, yes. I know for sure that it is the agent that contacts the server. Each communication, always started by the agent, uses a temporary arbitrary port. This means the agent does NOT have any port open, and that allows me to have agents even behind firewalls. That also makes the whole process very secure on the agent side.

The job is run on the machine by the user 'nobody', who has the lowest privileges and cannot tamper with system files or private user files. A new job can start only when the computer has been idle for more than 15 minutes. The job is run with nice=20, which means it has the lowest priority and will give up the CPU to all the other processes if needed. Thus, even when running, my process is not going to slow down the user's applications.
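
To illustrate the effect of the nice value, here is a minimal sketch for Unix-like systems that launches a child process at the lowest priority the OS allows. It is not how Xgrid itself starts jobs; the helper `run_at_lowest_priority` is a hypothetical name, and the sketch does not switch to the 'nobody' user (that would require root privileges) or check for the 15 minutes of idle time.

```python
import os
import subprocess
import sys


def run_at_lowest_priority(command):
    """Run `command` in a child whose niceness is raised as far as allowed."""
    def lower_priority():
        # Runs in the child just before the command starts. os.nice adds to
        # the current niceness; the OS clamps the result to its own maximum.
        os.nice(20)

    return subprocess.run(command, preexec_fn=lower_priority,
                          capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # The child reports its own niceness; os.nice(0) reads the value
    # without changing it.
    result = run_at_lowest_priority(
        [sys.executable, "-c", "import os; print(os.nice(0))"])
    print("parent niceness:", os.nice(0))
    print("child niceness:", result.stdout.strip())
```

A process with a high niceness only gets CPU time that other processes are not asking for, which is why a job run this way should stay out of the user's way.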

macnews.de: Xgrid is usable on all Mac OS X machines, right? Which ones have the most power?

As of Technology Preview 2, Apple's Xgrid runs on all machines with Mac OS X 10.2.8 or higher installed. Anyone who wants to run a program on different machines over Xgrid has to make sure that the binary will work on both Jaguar and Panther.

Daniel Cote has developed a Linux agent (http://www.novajo.ca/xgridagent/). However, I can't use Linux machines because my program was compiled for the Mac and depends on many Mac OS X-specific libraries.

The machines with the most power are... uh... the dual 2.5 GHz G5s??? The jobs are sent first to the most powerful machines. Dual-processor machines do get 2 jobs to run (as separate processes).
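
The dispatch order described above can be sketched roughly like this. It is only an illustration, not Xgrid's actual scheduler: the function `assign_jobs`, the agent tuples, and the use of per-CPU clock speed as the measure of 'power' are all assumptions.

```python
def assign_jobs(agents, jobs):
    """Hand out jobs to the fastest agents first, one job per processor.

    agents: list of (name, ghz_per_cpu, n_cpus) tuples (hypothetical format).
    jobs:   list of job identifiers.
    Returns a dict mapping each agent name to the jobs it received.
    """
    # One slot per processor, fastest machines first.
    slots = []
    for name, _ghz, cpus in sorted(agents, key=lambda a: a[1], reverse=True):
        slots.extend([name] * cpus)

    assignment = {name: [] for name, _ghz, _cpus in agents}
    # zip stops at the shorter list, so extra jobs simply wait for a slot.
    for slot, job in zip(slots, jobs):
        assignment[slot].append(job)
    return assignment


if __name__ == "__main__":
    agents = [("g4", 1.0, 1), ("dual-g5", 2.5, 2), ("g5", 1.8, 1)]
    print(assign_jobs(agents, ["job1", "job2", "job3"]))
```

With three jobs and these made-up agents, the dual-processor machine takes two jobs as separate slots, the single G5 takes the third, and the slowest machine stays idle.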

macnews.de: What other applications do you see for Xgrid?

Xgrid will be best for 'embarrassingly parallel' problems. For more tightly coupled problems, I think people will turn to existing technology and dedicated clusters.

It seems that an important field would be multimedia applications, like rendering and special effects. This is not my field, so I should probably not talk about it; I would just say stupid things.

In science, 'embarrassingly parallel' problems are not that common, I believe, and most of the current clusters are 'hard' clusters made of dedicated machines. The use of 'loose' clusters like those used by Seti@Home or Folding@Home usually means you have to give up some efficiency, but this is compensated by the cheaper and higher computational power you get. There are already several projects out there using distributed computing over the internet (see for example the list at http://www.aspenleaf.com/distributed/). Xgrid does make the process of building such a cluster much easier, and is also more general (it is very easy to run another program without updating the agents). If agents are developed for platforms other than the Mac, Xgrid could be a nice alternative to other similar software.

I am certainly not an expert in cluster and distributed computing, so I should probably stop here with my comments!

macnews.de: Charles, thanks for your time.

Thanks for the opportunity to talk about Xgrid. It is very stimulating!