Friday, May 20, 2011

The joy of solving a problem

I often wonder what level of technical understanding the people who read this blog have. But then I think, "What does it matter?" No matter how technically inclined you are, there is always something you don't know about. I can take that even further - in software engineering alone there are hundreds of things I know nothing about! If someone found out that I'm a software engineer and launched into a description of their latest web application that uses technology ABC to act as the front end for database XYZ, they'd probably lose me in a matter of minutes. So, I'll try to treat this as if you don't know much about the technology behind my story. But in the end, my problem is really not much different from the problems any of us has had to solve at one time or another.

One of my responsibilities is to oversee the building of our software product so that all the changes everyone in our group has made on the project each day get incorporated into a new version that can be installed the following day. We do this every day and it's called the Daily Build. What I REALLY do is set up commands for a computer program that actually does the work. That program is called Hudson. You can set it up to, for instance, build the project and make an installer for it every day at 4 AM (so the installer is ready for us in the morning). But setting up Hudson can be a big job and when things change, as with any computer program, you have to make sure everything is configured correctly or the Daily Build isn't ready for testing in the morning. One of the things that has to be set up is a Java development kit, which must be installed on the computer where Hudson runs. Java is a programming language and to write programs in it, you need other programs that take your program (written as text) and turn it into instructions the computer can understand. This set of programs is the Java development kit, and it changes from time to time as bugs are fixed and improvements are made. Since we had just released a version of our product and were getting ready to start work on the next version, it seemed like a good time to update the Java development kit on the Hudson computer. Updating is normally an easy process, but on the Windows operating system, telling Hudson where the new, updated version is located is a little more involved. I did that, ran some tests that showed it was working, and went home for the day.
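For readers who want to see what a check like that might look like, here is a minimal sketch in Java. It is not the test described above, and the class name JdkCheck and its helper methods are made up for illustration. It assumes only that a Java development kit keeps its compiler, javac, in a "bin" folder. The program prints which Java installation it is running on and whether a compiler sits alongside it.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of Hudson or the JDK.
public class JdkCheck {

    // True if the folder has bin/javac (or bin/javac.exe on Windows).
    static boolean hasCompiler(Path dir) {
        if (dir == null) {
            return false;
        }
        Path bin = dir.resolve("bin");
        return Files.isRegularFile(bin.resolve("javac"))
            || Files.isRegularFile(bin.resolve("javac.exe"));
    }

    // Older JDKs report java.home as their "jre" subfolder,
    // so the parent folder is checked as well.
    static boolean looksLikeJdk(Path javaHome) {
        return hasCompiler(javaHome) || hasCompiler(javaHome.getParent());
    }

    public static void main(String[] args) {
        // Standard system properties describing the running Java installation.
        String home = System.getProperty("java.home");
        System.out.println("Java version:     " + System.getProperty("java.version"));
        System.out.println("Java home:        " + home);
        System.out.println("Compiler present: " + looksLikeJdk(Paths.get(home)));
    }
}
```

If a program like this reports an unexpected version or folder, that is a hint that the computer is not using the Java installation you thought you had pointed it to.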

The next morning, I checked my office mail from home to see if any surprises would be waiting for me when I got to work, and I saw a message that the Daily Build had failed. The Hudson program sends this message, with a list of errors, whenever the Daily Build is not able to complete. The error list looked strange and I thought it might just be a simple matter of a power failure at the office and, since it looked like the power was fine now, I just restarted the build (doing that from home is a story for another day, but suffice it to say that I can do pretty much anything from home that I can do in the office, just a little more slowly and a little less efficiently). That second build didn't work either. So, I sent a note to everyone in the group that I'd tackle this problem first thing when I got in.

When I got to work and looked at the situation, I saw more problems. The tests that I'd run the night before no longer worked. As I tried a few minor changes to the Hudson configuration, more and different errors showed up. Then even stranger things started happening as time went on - without my doing anything at all! I began to think that the disk drive on the computer was going bad. If that wasn't the problem, it looked like the Java development kit update had not installed correctly. But why would it work at first and then fail overnight? The best thing I did to solve the problem was to not panic. When I tried to update the Java development kit again and it failed (in a different way), I was able to look at the clues and figure out that I needed to completely remove the Java software from the computer and start from scratch. I did that. Then, after many hours, I was getting different errors in the Daily Build - but these made much more sense. Now I was able to fix each problem as it appeared and finally, after about five hours of work, I was able to declare it fixed and produce a good Daily Build.

Boy, did that feel good! Every day I solve one or more problems and it's nice to be able to do that, but when you're faced with a potentially catastrophic problem, there is a special joy in diagnosing the problem and fixing it. I thought back to my earlier post, Cheering for Sports Teams, where I considered not getting involved with professional sports teams that often disappoint you and just following teams that never lose (like the Harlem Globetrotters). But then you miss the special joy when those teams that have disappointed you in the past come through and do great things. In the same way, I could shy away from taking responsibility for things that need to be done every day with precision and timeliness, but it's so sweet when you're able to do them under tough conditions. Those are the high points of an engineering profession and they wouldn't be as fulfilling without the hard problems.
