Ming's Coding Blog: 2005

Tuesday, December 06, 2005

Jakarta Commons CLI 1.0 Is Not Very Good

I wanted to do some quick command-line argument parsing in Java, so I grabbed the CLI component from Jakarta Commons and gave it a whirl. Unfortunately, this component is not very good. I fit it into a jar of about 28KB, which is not large but not that small either. The resulting command-line parser is able to identify various command-line arguments, but it isn't able to do more advanced parsing such as automatically converting arguments to integers (like jargs). It also isn't able to make use of Java annotation coolness (like args4j). It has a nice feature for automatically generating help messages, but the generated option list doesn't really match what the parsers will accept. For example, the help message will say that "-gui" is a valid option, but when using the PosixParser, the parser kept misreading the "-gui" as "-g" and rejecting it for some reason. I had to switch to the BasicParser to get it to do something reasonable.

Overall, it's nothing that I can't live with, but I was hoping for something a little better from Jakarta Commons. I think they should junk the multiple parsers, and just focus on a single more full-featured one.

Thursday, November 17, 2005

Recording with Java Sound on the Macintosh

I was trying out my little Java VoIP application with a friend of mine who owned a Macintosh, and it didn't quite work right. Besides the whole dubious Java 5.0 support for the Macintosh, I couldn't open a TargetDataLine at 8000 kHz.

On PCs, you can record and playback audio at arbitrary sample rates. That's not the case on the Macintosh. After getting another friend to do some probing, I found that the Java sound system on Macintoshes can playback at arbitrary sample rates, but can only record at 44100 Hz. You can record at 8/16-bit depths, with (un)signed types, and at different endianness though, I think.

Saturday, November 12, 2005

VoIP in Computer Games

About a year and a half ago, I started to become curious as to why there were so few computer games with VoIP support. It should be relatively easy, shouldn't it? You just grab an open source sound codec like Speex, throw in some sound code, add some network packet buffering code, mix, and voila--VoIP support. Unfortunately, I became distracted by something else and never got around to looking at the issue more deeply.

Just recently, I became curious again, so I decided to code up a small VoIP application. And, as expected, it did end up being fairly easy. I have to admit that I did waste a week going down the wrong path--I configured my sound card wrongly, resulting in a lot of feedback in the signal, so I spent a week examining algorithms for echo and feedback cancellation. But once I figured out what I was doing wrong, I was able to bang out a working little VoIP application in about a week.
The only part that wasn't really obvious (to someone who's been thinking about the problem for the past year) was the packet buffering code. Conceptually, it should be easy, you just store incoming packets somewhere and wait a few milliseconds before using them to compensate for packets coming in out of order and with jittered delay. But that leaves a lot of unresolved issues such as what should you do if you use up all the packets waiting for you? You can skip a packet (i.e. assume that they are lost) or skip a "heartbeat" (i.e. assume that you're consuming packets too quickly and wait for the packet to arrive). And what happens if you receive too many packets?

The scheme that I ended up using was to trigger everything based on the difference between the highest packet sequence number in the packet buffer and the next packet sequence number that the program is expecting. So whenever my program tries to grab the next packet, if the difference in sequence numbers is below a certain threshhold, then I skip a heartbeat; otherwise, I grab the next packet, skipping it if it hasn't arrived yet. If the difference in sequence numbers ever gets above a certain threshhold, then I advance the expected sequence number to get below the threshhold. I also had different threshholds for deciding when to adjust for clock-skew. Oddly enough, my sound card seemed to playback audio at a different sample rate than what I specified for some reason (it could quite well be my fault), so I put in some code for stretching incoming sound samples to take up more or less time. All of these threshholds should probably be adaptive somehow, but I couldn't be bothered, so I just set them to some default values.

Other than that, everything was pretty straight-forward. In Windows Java, the System.currentTimeMillis() method is very low-resolution, which caused my server to alternate between thinking it was very fast or very slow. Fortunately, newer versions of Java have a nanoTime() method, which provides a better high-resolution timer. I also had a bit of a problem with silence detection because my sound card didn't use 0 as a baseline. Sound samples seemed to center around -100 or so. I was able to get around this by taking the max and min sound samples of every chunk of sound, averaging them, and then exponentially averaging them with my previous estimate of what the baseline was. This seemed to work as an adaptive baseline. I'm not sure how to erect an adaptive silence detection algorithm on top of that though since I put a static algorithm there instead.

So the conclusion of all this is that VoIP support isn't included in more computer games due to laziness on the part of developers. It really is very easy to do. There may be some latency issues where computer game code is stealing so many CPU cycles that the VoIP code can't get CPU resources at regular enough intervals that it causes latency, but I'm sure it's something that game developers could work out with thread priorities and stuff like that.

Wednesday, September 14, 2005

Adding a Little Colour to Sine Waves

I've never really been much of a audio creation buff, but when I heard that Propellerhead were releasing their ReBirth audio program for free, I quickly downloaded it because I've always been curious as to what all the fuss was about. ReBirth is apparently an emulator of the TB-303, which is a piece of techno hardware from the 80s that could be used to generate weird instrument sounds and to sequence those sounds into some simple tracks. It also emulates some other pieces of hardware (namely some rhythm boxes). I actually found ReBirth to be really difficult to use and completely confusing, so I uninstalled it.

Fortunately, I also downloaded the free Rubberduck program from d-lusion, and it was quite fun and easy to use (not that I have sufficient talent to get the program to do anything that sounds good, but at least I can make those crappy sounds easily as opposed to with ReBirth). After playing with it for a while, I noticed that if I turned off all the filtering effects, I could still make some nice instrument sounds just by combining sine waves and square waves.

This was significant to me because I occasionally create little Java games, but it's hard to get any music into those games. Java includes support for MIDI output, and the full Java VM even comes with Roland sound patches for its instruments, but the existence of these patches isn't guaranteed, so you can't rely on them for game music. Instead, you have to supply game music via sound files, but then you need to generate these sound files somehow. I purchased some cheap music composition software, but it creates sound files by playing the music out through the computer sound card and recording it. Since my sound card is really cheap, the resulting sound files end up containing much too much noise to be useful. There are open-source programs that can take MIDI songs and output sound files directly, meaning you don't get any noise, but these programs all use sound patch sets of questionable legality, so I didn't want to use them. But if I could just make my own instrument sounds by combining some sine waves and square waves, then I could just create a program for converting MIDI songs into sound files myself without the problems of noise or legal encumbrances.

Unfortunately, once I wrote a program for combining sine waves and square waves in various ways, I realized that the sounds that my program was generating were puny and weak. Even the simple sine wave sound generated by d-lusion's Rubberduck was much fuller and colourful than the sine wave generated by my program.

The documentation for d-lusion's Rubberduck said that they pass their sound through a low-pass filter, but I thought I had turned all the filtering parameters off. Besides, a low-pass filter shouldn't really affect a sine wave. In any case, I'm really weak with signal processing, so I couldn't really design such a filter anyways.

Well, anyways, I was looking over the "Cookbook formulae for audio EQ biquad filter coefficients" document by Robert Bristow-Johnson which is, essentially, the dummies guide to creating various filters using digital feedback mechanisms. When I looked at what Rubberduck did to square waves (it rounded out the corners), I suspected that maybe I could just sort of randomly feedback things into itself and maybe get a similar effect.

So that's what I did:

y[n] = 0.8 * y[n-1] + 0.2 * x[n]

where x[n] is a sine or square wave and y[n] is the final sound output, and the result is exactly what I wanted. Whereas a simple sine wave is teeny and annoying, when you pass it through this filtering, you get a richer, rounder, more bloated sound. Of course, when you look at the raw wave form, it still looks like a sine wave, but I suspect that it's slightly reshaped so that you get some interesting side frequencies going on. I tried doing a spectral analysis, but I couldn't really interpret it (it looked consistent with a sine wave to me), but it sounds a lot different, so there must be some sort of subtle colouring there.

Of course, to get really interesting sounds, you need adjustable filters that allow you to get various cool effects, but I think I have pieces of code for making blip and bop noises for my Java games.

Monday, July 25, 2005

Computer Generation of Road Networks

For the past couple of days, I've been thinking of how to get a computer to automatically generate a city road network. I really have no idea what the best way to approach this is.

Doing a quick search through the Internet, there are computer techniques for automatically generating terrain, for simulating the growth and sprawl of cities, for simulating the growth of small neighbourhoods, and for simulating traffic on a road network, but only a few simulations for generating accurate road networks themselves. In fact, the simulations that do exist seem to focus on simulating lot use with the expectation that the road network just sort of "falls out."

I think the difficulty is that there isn't that much scientific use for simulating the development of road networks, so a lot of research hasn't gone into that area. Of course, the people who do research on urban sprawl really should do more research into road networks too because urban sprawl tends to be dependent on the location of existing highways etc. Anyways, the problem is also hard because road networks tend to be highly planned and need to take a lot of factors such as geography, cost, etc. into account during their evolution, so it's hard to get a computer to randomly generate this type of data.

Still, I wonder what it takes to build a road network generator. I imagine that an urban growth simulation model might actually be the wrong approach because it'll end up being too complex. I think a top-down approach that lays out some highways, throws down a coast-hugging road network near water, builds a grid patterned downtown, and a squiggly suburb area might actually be the easiest approach. Hmm.

Friday, June 03, 2005

Theoretical Underpinnings for Kids' Programming Tools 2

Y'know, yesterday I was fretting about the fact that without strong theoretical underpinnings, my programming website for kids would never become a useful tool. But there no reason for me to have such a crisis of confidence because, honestly, which of these people is most likely to make the best programming tool for kids:

A PhD from MIT who has been professionaly developing his patented educational programs for more than 10 years.

A Turing award winner who has spent decades studying user interactions with programming languages.

Me.

Obviously, my approach will be the best. :-)

Teaching about Loops

So I finally finished most of the artwork for the "if" section of my programming website for kids. It seems that the Java VM has improved a bit since I last used it. I'm finally getting consistent synchronization behaviour between when my applet running in a web broswer and my applet running stand-alone. And Swing now seems to support automatic text cut&paste with Windows applications (sigh, all of this could have been avoided if they used native widgets!).

And I'm now starting the section on loops. I've spent a few months thinking about this topic, but I'm still not sure about what the best approach is.

In traditional programming books about BASIC, the loop section would talk about FOR...NEXT loops:

FOR n=1 to 5
REM Stuff that happens 5 times
NEXT n

Unfortunately, the Javascript for loop is a little bit cryptic (especially since I haven't covered increment operators, the less-than operator, or the greater-than operator). Additionally, the convention for loops is that they start at 0. But, people are more comfortable with loops that start incrementing at 1 (like in BASIC). However, if I use 1-based loops throughout my examples, people might get confused when they later start working with normal code.

I suppose I could treat the syntax of the "for" loop as gobbledygook that should be treated as a black-box. This then reduces the number of concepts that need to be understood to work with loops.

Another alternative is that I could start with while loops. The advantage is that the word "while" actually means something, so it might be easier to understand. Unfortunately, there isn't all that much that one can do with a while loop without understanding the concept of sentry variables, which involves teaching 2 or 3 abstract concepts all at the same time.

I could teach the concept of the infinite loops first via "while(true)," but assuming that I don't teach boolean algebra beforehand (which I don't), then the whole command would need to be treated as a black-box, defeating the whole purpose of starting with the "while" loop in the first place. Then, I could follow that by teaching the concept of using "break" to break out of loops. This seems a little odd to me because no one else teaches loops in this way. Perhaps the "breaking out of loops" concept is simply too abstract a concept. But it does seem like a lot less work than discussing sentry variables and such.

Well, I don't know. For the past few months, I've been keen on the idea of starting with the while loop, but now I'm starting to come around to the idea of starting with infinite for loops. I'll have to ponder this some more.

Thursday, June 02, 2005

Theoretical Underpinnings for Kids' Programming Tools

During the weekend, I happened to come across a cheap scanner for sale, and I grabbed one so I would have the tools to start working on my programmingbasics.org website again. This then inspired me to take another look at the tools available for teaching kids to program. It looks like the same stuff as always--Squeak, ToonTalk, and Logo. Of course, all of these tools have strong theoretical underpinnings describing interaction models, learning processes, etc., whereas my website simply focuses on recreating the programming environment available to me when I was a kid. I'm no longer confident that I'm taking the right approach any more. Will my website (assuming that I ever finish it) have the correct foundation for teaching kids to program? Should I take a more holistic approach to the site and try to provide a complete environment for "experimentation" as opposed to simply creating something that focuses solely on just programming? Hmm. Perhaps I'm being naive. Kids today associate computer with great graphics and sound. Creating an environment that focuses on programming and doesn't allow kids to immediately generate flashy animations might lose their interest too quickly.

When I learned programming, all computers were textual. In fact, I didn't even own a computer. I was fascinated by flowcharts and wrote my first computer program out on paper by hand. The books available for teaching kids to program focused on real languages and on real syntax issues. And that's what I wanted to learn when I read those books. That's why my website is based on Javascript and focuses on syntax. But maybe those old programming books for kids died out for a reason, and maybe that reason was that kids aren't interested in learning programming that way any more. I guess I'll have to think about it.

FBX File Format

Previously, I've complained about the U3D file format for having such an obscure encoding scheme that I felt it was impossible to read a U3D file without using the unreleased Intel U3D libraries that everyone else apparently uses to read and write this file format.

But after looking at some other 3d file formats, perhaps this isn't so unusual. For example, Kaydara's FBX file interchange format (now owned by Alias) doesn't even have a documented encoding format. As far as I can tell, the only way to read a FBX file is to use the provided SDK libraries. Personally, I can't imagine how you could call something an interchange file format if you don't actually describe what the file format is. But maybe I'm just not used to how things work with 3d graphics.

Oddly enough, the only light-weight 3d file format that seems to be well-documented is Microsoft's DirectX file format. Unfortunately, it's apparently no longer under active development.

Wednesday, April 27, 2005

Fedora Core 3 and User-Mode Linux

Previously, I made a Fedora Core 3 disk image for User-Mode Linux by simply copying the disk image of a running system into a loopback disk image and starting User-Mode Linux using that disk image. That works, but I later found that some of my services didn't start up correctly in UML.

It seems that Mysql doesn't work with the current User-Mode Linuxes based on 2.6 kernels. Apparently, UML doesn't implement some of the new threading features of 2.6 kernels, which causes various problems. To avoid that problem, you have to erase the libraries that use these new threading features. I found a forum that listed this fix

mv /lib/tls /lib/tls.backup

that prevents the libraries that make use of the new threading features from being found. While you're at it, it's probably also a good idea to disable the SELinux features because they haven't been designed to be easily configurable in Fedora Core 3 anyways. Just modify /etc/selinux/config so that SELINUX=disabled (instead of "enforcing").

I also had problems with postgresql, but that was because I forgot to make /tmp world-writable. Unfortunately, my postgresql didn't seem to log anything, so I couldn't really diagnose the problem. I tried to play with the boot commands a bit to get it to log something, but in the end, I found it was easiest to simply run the postgres postmaster from the command-line with

su postgres
postmaster -D /var/lib/pgsql/data

Currently, my Fedora Core 3 image for User-Mode Linux seems to be running fine. httpd does seem to crash the UML kernel when it exits, but I can live with that for now.

Sunday, April 17, 2005

U3D is Half-Baked

So during the weekend, I decided to try writing a U3D file viewer. I had high hopes for U3D because U3D seemed to avoid many of the weaknesses of the competing X3D specification. After working with U3D for a bit though, I've concluded that the U3D specification is half-baked. It's almost as if the 33 members of the 3D Industry Forum performed almost no due diligence in reviewing the specification. Personally, I think the spec is unworkable and unimplementable. This is a real shame because Adobe went and added U3D support to Acrobat, meaning that a lot of marketing muscle and mindshare has just been put behind a losing proposition. Even if a better spec were designed in the future, since it wouldn't be supported in Acrobat like U3D, it would likely fail.

I can ignore minor annoying aspects like the fact that the spec itself wasn't well-designed, treating things like arrays in several different ways throughout the spec and going out of its way to avoid simplifying the descriptions of the fields, and such.

But the killer of the spec is its ridiculously overbearing bit encoding scheme. I just cut & paste the code from the specification and translated it to Java, and it's just awful. The code uses something like over 30 lines of code to read an uncompressed 8-bit integer. And the encoding/compression scheme has some characteristics that simply ruin the rest of the specification.

For example, the compression scheme sometimes requires you to feed in parameters of live data structures in order to read subsequent data. This means that you can't simply read the data into data structures and then process it later. You literally have to create a live mesh data structure, manipulate the mesh as data is being read in, and feed the resulting properties of the mesh back into the compression scheme in order to read the next bit of data.

And the encoding scheme has this weird property whereby previous data values read can influence future data values read. So even though the file format is supposedly tag/block-based (meaning that you should be able to process only the types of data that you are interested in and skip the rest), you still need to read, process, and understand every single bit of data in a file in order to read later bits of the file because otherwise the decoding state won't be properly configured when you get there. This even holds true for uncompressed data, which is just wonky. I wouldn't have believed it if I couldn't see it right there in the code that I cut & paste from the spec.

After having spent something like 5+ hours on trying to read in a CLOD Progressive Mesh Continuation Block from a file and failing miserably, I can only conclude that it's simply not worth the trouble to implement. I am now very suspicious about why the 3DIF still hasn't released a reference encoder/decoder three months after the release of the specification. In any case, as far as I'm concerned, U3D is a write-off. It might gain a little bit of mindshare because of its Acrobat-support, but I think it will eventually die off because the only way to create U3D files is to use the rumoured Intel SDK (since it's essentially impossible to write one's own encoder), and most 3D tools developers will decide to ignore it.

Friday, April 15, 2005

Some First Impressions on the U3D Format

I haven't actually done any examining of U3D files yet, but I did a quick skim through of the specification, and I have a couple of first impressions.

The first thing I noticed was that they put a "file size" field in the file format. Personally, I think no self-respecting file format includes a "file size" field. The inclusion of such a field makes it impossible to stream data to stdout. Instead, you have to write out all the data, then rewind all the way back to the beginning of the file, just to fill-in that field. It would be different if the field served some sort of useful purpose, but I imagine all U3D file readers will just ignore it.

Then, the basic geometry type of the file format is the Continuous-Level-of-Detail (CLOD) triangle mesh. This is a little unusual because I thought the consensus in the 3d graphics community was that no one actually uses this stuff (of course, not being in the 3d graphics community myself, I can't be 100% sure, but that's the impression I get from reading papers). The main uses of CLOD meshes are for progressively refining 3d models during data streaming and for animation. I think the Internet experience has been that progressively refining anything during data streaming just isn't worth the trouble. For example, who uses the "interlace" option on their PNGs and GIFs? And properly getting the synchronization right on browsers so that they can handle the rendering of partially downloaded files is just a big mess. Just look at the old Java image APIs that supported these features: they were almost unbearable to use. It's easiest to simply use less-detailed thumbails/models. And I think CLOD meshes aren't used for animation either. If refined versions of CLOD meshes are computed by the microprocessor, then they have to be transferred to the video card for rendering, which is a waste of video bandwidth. It may be possible to do the refining of CLOD meshes on the video card, but I think no one does this because a) it's hard, b) CLOD meshes take up a lot of memory c) refining CLOD meshes aren't easily parallelizable and exhibit poor memory access behaviour d) you can get nicer looking low-polygon models by editing them by hand and using bump-maps and stuff. I suspect that the only reason CLOD meshes were included in the specification is that Intel wanted them there, but I would have thought that the rest of the 3DIF members would have convinced Intel to leave the feature out for the first version and only include it in version 2.0 (and then decide never to release a version 2.0 of the specification). I think it's possible not to use the CLOD mesh features though.

The U3D file format also uses a large variety of data types. I guess it isn't a real problem, but in this age where bandwidth is cheap (and things can be compressed fairly efficiently anyways), I think it makes more sense to have a couple of floating point types, an integer type, a byte type, and that's it. I haven't really examined what all the different types are used for though--there might be a good reason.

And the "U3D" way of doing things means that most resources are given string names. This seems a little inefficient (especially if you use something like UTF-16 encoding). It would seem to me that named resources would be more the exception than the rule, so it would make sense to have most resources be unnamed, and provide an optional naming interface for those resources that do, in fact, need names so that they can be accessed externally.

Also, the file format uses some weird lossless compression scheme. If the compression scheme were decent (like, say, the lossless compression that comes with jpg, mpg, and png), then I guess it might be worthwhile, but it looks like it's possible to get even better compression just by feeding the data into a zip compressor, so I'm surprised they even bothered.

I haven't really dug all that deep into the U3D file format yet though, so until then, I can't really say anything intelligent about the standard. But hopefully, I'll get around to doing that in the future.

Finding U3D Files

For the past few weeks, I've been wanting to play with the U3D file format a bit. Unfortunately, no actual U3D files are available on the web. There are U3D files embedded in pdf documents that can be viewed with Acrobat Reader, but if you want to look at an actual U3D file, you're out of luck. Intel apprently has a SDK for creating and reading U3D files, but they haven't released it to the public yet, so the most widespread way for people to get an actual working U3D file right now is to get a trial version of commercial 3d file format conversion software (which has Intel's U3D SDK embedded in it) and to convert some existing files over to U3D.

I tried e-mailing the 3DIF, asking them to put up a plain U3D file somewhere so that I could take a look at it, but they never bothered replying after a week. Fortunately, after a little bit of research, I found a utility called Multivalent. This utility comes with a tool called Uncompress, which is able to take a pdf document and uncompress all the files attached inside. Of course, afterwards, you still have a pdf document with U3D files embedded inside, but since the U3D data is now uncompressed, it's possible to examine the raw U3D data.

Since all U3D files start with the string "U3D\0" you can just discard all of the pdf data that comes before that string, and everything after that is a valid U3D file (I presume). I used a little Python script like this to discard the initial useless pdf data:

f=open('file.pdf')
data = f.read()
f.close()
# You need to take the second "U3D"
idx=data.index('U3D', data.index('U3D')+1)
data = data[idx:]
f=open('file.u3d')
f.write(data)
f.close()

Saturday, April 02, 2005

Maybe X3D Isn't Such a Bad Spec After All

After previously denouncing the X3D specification, I decided to take a look at the competing U3D specification. I skimmed through the specification trying to think about what would be required to implement a U3D reader (the spec mandates a binary encoding with special compression used) when I realized that there aren't any U3D files available for testing U3D implementations. Nowhere. Except when they're embedded in pdf documents. Perhaps one has to be a member of the 3dif to get access to them, but the problems I have with the X3D specification pale in comparison to the no-no of developing a specification that uses a complicated encoding without providing an example file anywhere. Well, I e-mailed the 3dif. Maybe I was just looking in the wrong places.

3D Tools for the Casual 3D Artist

Every few months, I get the urge to learn how to do rudimentary 3d art, so I do a quick tour of 3d tools available for casual users, and I always come back disappointed. For casual users, it's important that tools be cheap, not require users to have artistic talent in order to create simple objects, and be easy to use.

Immediately, the big, mainstream 3d modelling packages can be ruled out because they're too expensive. Of course, it's possible to use illegal copies of these programs, but I prefer not to do that. Then there are the mid-tier tools, but I'm always hesitant to shell out several hundred dollars for little-known tools from little-known companies that may or may not be any good.

This pretty much leaves low-cost and open source tools. The tool with the most mindshare currently is Blender. Unfortunately, Blender is a good example of a software application where little or no attention has been paid to the user-interface. If you read the development docs, this was an intentional decision on the part of the developers. For example, up until a couple of months ago, Blender proudly touted the fact that it had no undo functionality (telling you to save often instead). Personally, I'm tempted to use terms like "amateur-hour" here, but honestly, if I were a professional 3d artist who had just shelled out thousands of dollars for this tool, I would probably be ok with learning all of the weird UI quirks and hotkeys needed to be productive with the tool. Since I'm not a professional 3d artist and the tool is free, it's probably unfair of me to apply my standards on the tool. I have made a few honest attempts at trying to learn Blender, but all the hotkeys made my head spin and it seemed to crash quite often for me. Plus, based on the tutorials, it seems like users need powerful 3d visualisation skills to make effective use of Blender--skills that I don't possess. There's no way I can think in terms of "take a sphere, extrude this section while rotating it like so, perform a lathe on that object, and then edit the mesh." At best, I can look at a 3d object and play with the vertices a bit to get the shape that I'm looking for.

Milkshape3d is another popular 3d tool, and I've been tempted on many occasions to shell out the cash to go buy it. Milkshape3D is nice because it does only a few small things, but it does those things reasonably well. Unfortunately, every time I'm about to mail away my money to Switzerland, I start to think about the tedium of clicking in multiple windows everytime I want to add a single point to the model, and how I always get confused when I have to keep looking at different windows to know where I am, and I decide to wait for something better. I'm probably being hopelessly optimistic because manipulating 3d geometry on a 2d screen with a 2d input system (the mouse) is, by definition, hopelessly complex. But still, when I've used scaled back versions of professional tools (e.g. GMAX), they always provide nice facilities for manipulating polygon vertices without much hassle, so maybe that's what I'm looking for.

Art of Illusion seems to be reasonably powerful, but I think its 3d graphics isn't hardware accelerated, so it always seems a little sluggish to me. Plus, I can never figure out what the icons mean and the tooltips take too long to show up, so I spend much of my time holding my mouse pointer over icons to figure out which one does what I want.

Recently, I tried Wings 3D, and I really liked it. It's possible to perform lots of complex operations with complete ease. Pretty much all of the UI is self-explanatory and mouseable. The main advantage of the UI is that since everything is explained on-screen, you only have to memorize five things to be productive--namely what G, L, Q, space bar, and the middle mouse button do. With everything else, all you need is a general idea of what you want to do, and you'll probably be able to find it in the menus somewhere. The only problem with Wings 3d is that it doesn't work with many graphics cards. I tried it on an ATI Rage card, and an Intel integrated Extreme Graphics, and the UIs ended up being too horribly mangled to be usable. Finally, on an ATI 7500, the UI still didn't quite render correctly, but it was enough to be usable, and it was a joy to use. I was tempted to dig into the code and see if I could find any workarounds for my problems, but Wings 3D is coded in Erlang, so...yeah.

So in conclusion, if you want to dip your toe into 3d graphics on the cheap, try Wings 3D, and if it doesn't work, give up. Of course, I may be the only person who finds xfig to be an easy-to-learn program that one can quickly become productive in, so perhaps my experiences may differ from that of others.

Wednesday, March 30, 2005

Creating Fedora Core 3 Disk Images for User Mode Linux

So, for the past couple of weeks I've been working on creating a Fedora Core 3 disk image for UML. In the end, the process ended up being quite straight-forward, but in a case of "a little bit of knowledge can be a dangerous thing," I kept screwing it up.

Pretty much, all you have to do is to create an ext2 loopback disk image as suggested in the UML docs, and then copy a running fedora core system (using cp -r -d --parents --preserve=all or whatever) into it (obviously, you should avoid copying the disk image into itself). That's pretty much it. Labeling the disk image as /1 doesn't seem to be enough to get UML to mount the disk image properly as its root file system, so you do have to modify the /etc/fstab file so that the root file system is mapped to /dev/ubda (for some reason, the virtual drive appears as /dev/ubda instead of /dev/ubd/0 like in other images) using an ext2 format (unless you compiled ext3 support into the kernel). It's probably a good idea to enable networking in the kernel as well by turning on these kernel options:

UML Network Devices/Virtual network device
UML Network Devices/TUN/TAP transport

Then, you can just boot up UML into the image you created. You don't have to change the Kudzu configuration, change the networking to use something other than eth0, or anything. It just works. Below are the things I tried to do, which ended up being completely unnecessary:

Although Fedora Core 3 intially boots with a small initrd RAM disk image as its root file system before switching to using the hard disk as its root, this initrd is only for loading additional kernel modules. Since User Mode Linux has most options compiled directly into the kernel, it is not necessary to use an initrd image with UML. Also, it is not necessary to create ubd devices in /dev since Fedora seems to do that automatically. And hostfs mounting /sbin or /bin doesn't work out too well. You can mostly get away with hostfs mounting /usr but you have to prevent xfs from starting up (simply remove /etc/init.d/xfs) because I think it tries writing stuff into /usr, which won't work because on the host-side, UML won't have sufficient permissions to write there. I also didn't bother properly installing uml-tools, meaning that UML couldn't use the port-mapper to bring up the virtual console xterms. Since Fedora Core 3 requires you to log-in through a virtual console, you need to get the port-mapper working properly (I had to modify UML to find the port-mapper in my home directory instead of its default location). Also, Virtual Console #7 and #8 won't appear or won't show anything, so don't think the boot-up has failed if you don't see anything there. And that pretty much summarizes many, many days of mistakes.

Monday, March 21, 2005

Off-Screen Buffers in Java

I recently discovered that off-screen buffers in Java can be really slow. Off-screen buffers are commonly used in Java games so that a frame of animation can be constructed and then displayed on the screen without resulting in flicker. Since the Swing toolkit is double-buffered, most applications don't normally need to create their own off-screen buffers, but I don't like using the UI thread for game logic and rendering in my games (plus my games use AWT), so I have to manage the off-screen buffers myself.

Anyway, many graphics operations in Java are not hardware accelerated yet, and some of the buffer code that I'd been using for a long time has been using some really slow operations; I'd simply never noticed until I started working on a computer with a really old PCI Rage 128 graphics card. With such low bandwidth across the PCI bus, blitting the off-screen buffer images to the screen ended up being really, really slow.

The problem is that I'd created the BufferedImages backing the off-screen buffers as TYPE_INT_ARGB. I think the inclusion of an alpha channel, combined with an assumed lack of hardware acceleration, resulted in a lot of data needing to flow back and forth across the PCI bus unnecessarily. Once I switched the buffers to be TYPE_INT_RGB, everything was back to being snappy and fast.

Monday, February 21, 2005

Running User Mode Linux from 2.6.10 Kernel

I don't really know anything about mucking around with the Linux kernel (I'm the type of guy whose favourite text editor is pico and who can't figure out how to use Debian meaning he has to use RedHat instead), but somehow I signed up to do a term project involving a bit of kernel hacking, so I guess I have to learn. Since kernel stuff seems a bit messy, I opted to work with User Mode Linux (UML).

Now, when one browses around through the main website, there's a lot of stuff about ancient versions of UML based on 2.4 kernels, there's a couple of mentions of 2.6 kernels, and tucked away somewhere, there's a brief aside that UML has been integrated into Linux kernels 2.6.9 and beyond. Seeing as everything else on the main website seemed a little dated, I thought that it would be best to use the UML that's integrated into the main Linux branch because it would be more likely to be maintained and tested there. In actuality, the UML code that's been integrated into the kernel doesn't work all that well, and it took me a few days of searching through various websites, wikis, forum discussions, etc. to figure that out. So here are the goods on how I got UML working on my Fedora Core 3 system.

Apparently, the UML in the current 2.6.10 vanilla kernel doesn't work. Instead, grab the 2.6.9 vanilla kernel from kernel.org or somewhere. Then, go to Paolo Giarrusso's UML site and grab the latest 2.6.9 patch (it's in his archives in the guest patches section). After you've unzipped and patched everything, simply taking the default configuration from

make menuconfig ARCH=um

isn't sufficient. One has to enable these options in the kernel

Character Devices/File Description Channel Support
Character Devices/Port Channel Support
Character Devices/tty Channel Support
Character Devices/xterm Channel Support
Block Devices/Virtual Block Device
File Systems/Pseudo File Systems/dev File System Support

Then, you can follow the normal instructions about compiling, stashing a root_fs file system somewhere, and then using the devfs=(no)mount option appropriately.

This is enough to get UML to bring up a prompt, but that's as deep as I've dug so far. I'm still weighing whether it would have been easier simply to have started with a Debian package of UML.

Tuesday, February 08, 2005

CorelDraw 11 Import/Export Woes

Being from Ottawa, I'm a CorelDraw user (Corel is an Ottawa-based company). I'm not a very good CorelDraw user (I just use it to draw simple diagrams), but I'm still a CorelDraw user. Just like other CorelDraw users, I have to live with both the good and the bad features of CorelDraw. It comes with a lot of features and applications for a good price, but you have to live with the ever expanding resource usage of each version, buggy features, an under-designed user interface, and a general lack of polish. Perhaps things will get better now that they're under new management.

I bought a discount version of CorelDraw 11 a few months ago, and I use it for mostly computer science type things: presentations and LaTeX diagrams. In order to use CorelDraw in this sort of configuration, you need to do a lot of bulk importing and exporting. I just thought I would blog my approach to making CorelDraw 11 do what I want in case the one other person who uses CorelDraw in this sort of configuration is interested :-).

For example, I usually draw all my LaTeX figures in one CorelDraw document and then use a bulk export macro to export each page as a different eps file that I can reference in LaTeX. CorelDraw 11 doesn't let you configure eps export options from a macro, so you have to export one eps file by hand first and configure the export options there, and then when you run the macro, it will just reuse the export options that you set before.

Also, pdflatex prefers to import vector graphics in the pdf format. Unfortunately, CorelDraw 11's pdf exporter doesn't support bounding boxes, so the exported pdf files do not import correctly. I found this program called eps2pdf (I use the one that comes with a nice Windows GUI because I can't figure out how to invoke the preferred eps to pdf conversion utility "epstopdf") that can take eps files exported from CorelDraw and converts them into pdf documents that can be imported into pdflatex. The program chokes if the Corel-exported eps files have text converted to curves or if the Corel-exported eps files are too large. It's possible to invoke eps2pdf from the command-line to convert all eps files in a single directory, which is useful.

Just recently, I've tried to import a lot of xfig-created eps files into CorelDraw 11. This sounds a little odd, because it doesn't make sense to create diagrams in xfig when you own a copy of CorelDraw. The reason I needed to do this was that I needed to generate a lot of complicated machine-generated Postscript diagrams. I don't understand Postscript, but xfig generates very clean, easy to understand Postscript output. So when I'm in this situation, I draw a couple of simple primitives in xfig, export it as eps, then write a program that follows the template generated by xfig in creating its own Postscript diagrams. Unfortunately, Corel's eps importer imports the text in xfig eps files as curves instead of as text. I managed to side-step this problem by using ps2ai.bat to convert the eps Postscript files into ai Postscript files. CorelDraw 11 then imports the text in these ai files without problem. Unfortunately, somewhere along this conversion process, extra outlines get added to objects. It's possible to simply delete these outlines by hand (especially easy if you use the ObjectManager view), or you can import the xfig eps files directly, copy all of the non-text objects, and then use these objects to replace the non-text objects in the imported ai files. There's probably an easier way to do this, but I'm still trying to figure it out.

Friday, February 04, 2005

Extensibility in X3D

During my rant against X3D yesterday, I wanted to complain about the extensibility model of X3D as well, but I couldn't think of a better approach to supporting extensibility in a 3d file format, so I decided that it would be inappropriate to make a big deal about it.

Of course, I thought up of a better scheme last night, so now I feel justified in complaining about the approach used by X3D to support language extensions. Although I complained about how the X3D event model was too abstract to usefully express certain concepts, I find that the X3D scene graph hierarchy to be too concrete and specific to support extensibility in a graceful manner.

In X3D, scene graph nodes are well-defined types. Each node has specific fields and no others. To add support for different objects, one has to create a new scene graph node with the fields one needs (or insert a lot of meta data, but I'll discuss that later). If an X3D browser encounters a node type that it doesn't understand in an X3D file, probably the best it can do is to simply ignore that node. Unfortunately, this doesn't degrade gracefully. So, for example, consider the mesh objects that are currently in X3D. If I want to make an articulated mesh, I need a way to add weights to each of the mesh vertices describing the influence of various bones on the position of the mesh point. To do this, I need to make a new scene graph node for articulated mesh objects that has a field for holding the bone weights. So now, if an X3D browser encounters my new articulated mesh node type in a file, it won't know what to do and will simply not render it. Ideally, the browser should simply degrade gracefully and render the data as a normal mesh, but it's not possible because there's no way for the browser to know that there's a link between an articulated mesh and a normal X3D mesh. Over time, as more and more node types are added to the standard, the standard becomes so big that no browser is able to implement the whole specification, resulting in many X3D files that are simply incompatible with many X3D viewers. In fact, this can already be seen in the current standard which has different node types for IndexedFaceSets, IndexedTriangleSets, IndexedTriangleFanSets, and IndexedTriangleStripSet. All of these node types are variations on the same theme, but an X3D browser must be explicitly coded to handle all four types separately.

One way around this problem is to use the metadata facilities of X3D. So an articulated mesh could be stored as a normal mesh with all the bone weights stored as metadata in the mesh. A browser that doesn't understand the metadata can just render the mesh as a normal mesh, and a more advanced browser can interpret the metadata and extract the bone weight information. Similarly, all the different triangle sets could be encoded as IndexedFaceSets with metadata suggesting the optimisation of rendering the node as triangles. And therein lies the way to gracefully supporting extensibility in X3D. There shouldn't be set node fields in X3D. Instead, all fields should be metadata. Most 3d objects can be described as a control surface with various extra descriptive data thrown in. As such, X3D should simply abstract all 3d objects to being a base control surface type, and all the extra descriptive data about normals, colours, and shape, etc. should just be metadata. So a cone is a box node that is tagged as a "cone" in its metadata. A height map is just a mesh node that is tagged as being a "height map" with some metadata describing the orientation of the heights. It may not be the most efficient way of storing data, but the benefits in terms of gracefully supporting extensibility is worth it. A minimal X3D browser then only needs to be aware of a small number of abstract node types. Even on very complex 3d models, an X3D browser will always be able to extract something that it can render.

Instead, X3D simply took the flawed VRML model and recoded it in XML. I guess we can always wait until X3D 2.0.

Thursday, February 03, 2005

U3D vs. X3D

I was looking at the website of the purveyor of the X3D format yesterday, and I noticed that they had a newspost slamming the rival 3D format U3D there. I haven't read the U3D spec yet, but based on the newspost, it sounds pretty good. In fact, I think that if the U3D spec had been available when I started my X3D project, I would have used U3D instead.

The newspost complains about how X3D only supports triangle meshes and the like. Honestly though, having just finished implementing a horrible n^4 concave polygon triangulation algorithm for my X3D viewer, I'm liking the idea of a format only supporting triangle meshes more and more.

I've only implemented a small amount of the X3D spec, and I can't help but feel that it lacks a certain elegance and simplicity in its design. I think many of the problems evolved out of the fact that the designers wanted people to be able to code X3D by hand. As such, the spec supports lots of shortcuts and features to aid people coding up X3D manually such as the ability to leave out certain tags or to let the browser automatically calculate normals etc. These sorts of things simply make the implementation more complicated and results in lots of "special cases" in the specification. In reality, no one designs a 3d model using text (believe me, I tried this once. It's totally hopeless), so it would have been better to leave out stuff like that.

I'm not implementing the X3D event model, so I'm not too familiar with it, but my gut feeling is that it is likely too expressive. The Postscript language has a similar problem in that it is a full programming language, meaning that extracting meaning from the language is extremely difficult without actually executing it. For example, let's say I wanted to export an animation to another program. The exported animation will have a clock object whose timing is fed into some sort of coordinate generator whose events would be fed into the actual object being animated or something like that. Would an X3D importer be able to interpret the combination of event objects and event routing as being an animation and to import it as such? Or would it simply have to import these event nodes as is and leave it to the user to figure out that it represents an animation? Just as human language is too expressive for computers to understand, which is why we have programming languages, the X3D model might be too expressive for an import tool to recognize the patterns, which is why more restrictive languages are often more useful. If a language is too expressive for an import tool to understand its meaning, resulting in it being imported as a black box, then the language might as well be some standard language like ECMAScript that people are already familiar with.

And of course, there's my bias against "platforms." I feel that the best specifications are for little self-contained toolkits that can be bolted on to existing applications. X3D was designed as a complete platform for 3D browsing, so it has a tendency to want to take over your application as opposed to being a simple bolt-in. For example, to implement a module for importing animation, you need to add the event model to your application, some sort of reflection mechanism for X3D objects to parse the event code, the actual X3D objects themselves, etc. etc. Afterwards, it's no longer a simple little import tool; you've essentially just written an X3D browser. At that point, you might as well just write your whole application to be X3D-based.

X3D simply tries to do too much and as a result is too complex. U3D focuses on one small aspect of 3D file exchange and hopefully ends up being small, graceful, and easy to use. It's somewhat like the TIFF file format which is so complicated and supports so many features that no one really uses it for anything. It's a coin toss as to whether an arbitrary TIFF file might be importable by a TIFF import tool or not: there might be multiple pages in there, encoded in some weird colour space, compressed using some unknown scheme, etc. With PNG, you're pretty much guaranteed success.

Saturday, January 22, 2005

JSR-184

After a break of a few weeks, I went back to X3D coding, and as a result, I now have a working X3D parser, and I've implemented just enough of the X3D spec that one can look at boxes and set a viewpoint.

I started to become a little worried after I noticed that I was approaching 6-7k of code just to draw a little box, so I decided to port what I had to J2ME to make sure everything worked. Good thing I started early because if I had left the port until the last moment, I surely would have had a seizure or something.

Apparently, vertices and normals are defined using bytes and shorts in JSR-184 (the current Java specification for doing 3-D graphics for cellphones). Yes, I am aware that cellphones do not have floating point units, making floating point math very slow, but bytes and shorts? This makes the porting of any graphics code from desktops to cellphones mind-numbingly difficult. Everything has to be torn out and carefully redesigned so that they work correctly with byte and short values for vertices. And such a design has a shelf-life of about two years because if 3-D graphics ever take off cellphones, floating-point units will be added to the devices, and developers will demand a new API with better floating-point support. Sure, I'm glad that Nokia is very proactive about defining new Java API standards for cellphones; otherwise, the latest cellphone features simply wouldn't be accessible from Java (or, at least, not in a standardized way). But I really do wish that Nokia would spend a little bit more time polishing their APIs before releasing them, so that the APIs wouldn't fall into obsolescence so quickly, necessitating the creation of yet another unpolished API to replace it, which'll become obsolete again in just a few years etc. etc.

Would it have really been so hard to allow vertices and normals to be defined using floating point values? Sure, it would be slow, but it would be nice to have that option at least. And adding that support would add another year or two to the life of the spec, at a minimum. I might expect this sort of lack of foresight from a bunch of amateur hacker coders, but I imagine that Nokia probably has some good people on this. I think it must be part of a larger corporate strategy of planned obsolescence in software. Because the old APIs are so limited and restrictive, developers will have to keep on moving on to newer APIs to make use of newer features. Since the new software is incompatible with older phones because the software makes use of the newer APIs, people have to upgrade their phones to use the newest software. And, if the new phones don't even bother to include the old APIs, then users can't just copy software from their old phone to their new phone; they have to buy all new software for each of their phones. This is good for the corporate establishment not only for the above reasons but also because it hinders piracy and allows vendors to control the availability of software on different platforms. All of this, of course, is meant to oppress the consumer and to wring as much profit as possible out of the pockets of the average worker, the elderly, and teenagers around the world.

Or, maybe Nokia just hired a bunch of amateur hacker coders.

JOGL and Resizing Windows Part 2

Yeah, apparently the problem I was having before was simply a driver problem. Of course, Intel hasn't updated their drivers in a while, so I guess I'll have to live with it.