Ming's Coding Blog: 2012

Monday, October 01, 2012

Scarborough Subway Plans

During the last few months, talk about building a subway in Scarborough has been cropping up again. Personally, I find this a little worrisome. Although my usual philosophy towards transit is that is that building anything is better than building nothing, building a subway is Scarborough just seems like a big waste of money that will sap money away from more important projects. Subways exist because they provide capacity and speed for transit users. There is no evidence that Scarborough is in need of either. If there were capacity problems, the TTC would be running high capacity buses in dedicated bus lanes, but they're only starting to consider deploying high capacity buses now. If speed were a problem, the TTC would offer a highway 401 bus or have special buses to redirect people to GO train stations, but they aren't. The reality is that suburbs are designed for cars and building a mass transit system there for moving hundreds of thousands of people would be a waste of money. In order to achieve increased capacity and speed in a transit system, you need space. In high density areas, there's no space, so you need to build subways. In the suburbs, there's plenty of space, so there's no need to build subways to achieve capacity and speed goals. For the price of a subway, you could probably just buy out whole suburban neighbourhoods, plow them under, and build a fast surface bus or train instead.

But to make sure I wasn't judging anything unfairly though, I decided that maybe it would be good idea to run some simulations to see what the effect of different subways plans would be. Following the same experimental methodology from my previous simulations, I ran some simulations of the effect of different plans on transit times to downtown. I believe that all of these plans would end up costing around $1bn-$2bn. Following the METROLINX guides, surface LRTs are modeled as having an average speed of 22km/h while grade-separated LRTs and subways are modeled as having an average speed of 32km/h. The future Eglinton LRT is included in the simulations even though it does not offer any benefits to people in Scarborough going downtown (though it would useful for people going to midtown).

Current System

As a baseline, here is a map of transit times of the current transit system in Scarborough. This system is actually unsustainable because the Scarborough RT line from Kennedy station to Scarborough Town Centre is obsolete, over-capacity, and needs replacement.

Transit times to downtown.
Data, imagery and map information provided by MapQuest,
Open Street Map and contributors, CC-BY-SA.

From the map, it can be seen that most transit users in Scarborough need about an hour to reach downtown. Most transit users don't live near a transit line, so they need to take a bus to the Scarborough RT line or to the Yonge line in order to reach downtown (people north of the 401 who don't live on Sheppard generally don't take the Sheppard subway because it only comes once every 7 minutes, so it's faster and more convenient for people to take an express bus direct to the Yonge line). Scarborough is laid out as a grid of major roads. Inside each grid square is a tangle of residential roads that are generally impassable by buses. As such, a transit user must walk to the nearest major grid road and take a bus from there. I suspect that this is part of the reason for the short distance between bus stops (as compared to suburban bus systems in other cities). Although this results in slower buses, transit users already have to walk some distance to reach a major road where buses run, and it would be unfair if they have to walk another half kilometre along the road to reach a bus stop. Buses in Scarborough generally run along these major roads, forming a grid pattern. One exception to this pattern is that many buses are rerouted to Scarborough Town Centre, which is intended as a transit hub for Scarborough.

Sheppard LRT with SRT Revamp and Extension

The default transit extension that will be built by the Ontario government if there is no political interference is an LRT along Sheppard and an extension of the SRT to Sheppard and Markham. The Sheppard LRT is modeled as coming with a six minute frequency, similar to the frequency suggested by METROLINX for the Eglinton LRT. The frequency of the SRT was not adjusted (I think it has a 3-4 minute frequency during rush hour) even though it will likely come less frequently after the revamp because newer high capacity trains will be used on the line after the change.

Transit times to downtown with a Sheppard LRT and SRT extension.
Data, imagery and map information provided by MapQuest,
Open Street Map and contributors, CC-BY-SA.

The Sheppard LRT does not provide any benefit over the existing system. Current Rocket express buses already reach an average speed of around 20km/h, which is similar to the speeds expected to be achieved by the LRT. The Rocket express buses also come more frequently than the LRT, so the LRT won't provide any benefit over the existing bus system in terms of transit times. The LRT will likely be cheaper to run than the bus system though. The SRT extension does seem to save transit users living on Markham south of the 401 about 10 minutes. People who live along Sheppard in Malvern will also see savings of about 5 minutes in transit times. These time savings might disappear though if the revamped SRT is run at a lesser frequency than it currently does since all the time savings of having a faster mode of transport will be lost by having to wait longer for the train to arrive.

Sheppard Subway Extension to Victoria Park

Historically, the long-term plan for a subway in Scarborough is an extension of the Sheppard subway to the Scarborough Town Centre. This would be prohibitively expensive, so current suggestions are that an extension of the Sheppard subway to Victoria Park could be used as an intermediate step to demonstrate the benefits of a full subway.

Transit times to downtown with a Sheppard subway extension to Victoria Park.
Data, imagery and map information provided by MapQuest,
Open Street Map and contributors, CC-BY-SA.

A subway extension to Victoria Park provides a ten minute in transit times for people who live at Victoria Park. People who live on Victoria Park near Sheppard will see a 5-8 minute improvement since they don't have to change buses at Sheppard in order to get to the Sheppard subway at Don Mills. People living on Sheppard west of Agincourt will see a modest improvement in transit times since they can transfer from a bus to a subway at Victoria Park instead of having to cross the 404.

OneCity Subway

Recently, there has been a suggestion that the SRT should be scrapped and the Bloor subway should be extended to Scarborough Town Centre and then up to Sheppard. This will supposedly result in faster travel through Scarborough because no transfer from the SRT to the subway will be needed at Kennedy subway station.

Transit times to downtown with the OneCity extension of the Bloor subway to Sheppard.
Data, imagery and map information provided by MapQuest,
Open Street Map and contributors, CC-BY-SA.

Eliminating the SRT results in a sharp reduction in service in the area around the SRT. There is also a surprising increase in transit times for people living in Agincourt since they no longer have a short bus ride to the SRT. Instead most of the transit benefits of having a nearby transit line are pushed east to the Scarborough hospital and Malvern West.

"Subway" to Pacific Mall

Out of curiosity, I decided to model the effect of scrapping the SRT and building an LRT along the train corridor from Kennedy to Pacific Mall in the north. One problem with building rapid transit lines in Scarborough is that the main transit hub for the area, Scarborough Town Centre, is in the middle of nowhere. It is not near the major transit corridors of Scarborough, so many buses actually have to be redirected from their natural routes in order to reach there. The only major transportation corridor that the Scarborough Town Centre is near is Highway 401. As can be seen from the 2006 strategic plan for the SRT, 75% of SRT users ride a bus to reach it with users mainly coming from Malvern, Malvern West, and the Milliken area. So instead of designing a squiggly line of a system, I decided to see the effect of building the longest, cheapest, grade-separated "subway" possible that would intersect as many existing bus lines as possible--i.e. an LRT along the current train corridor from Kennedy to Pacific Mall. Such a line has more potential for provincial and federal government support since it serves multiple regions (Markham and Scarborough) at a modest cost, can be easily extended later on to the Highway 7 BRT, and much of the electrical infrastructure will be needed in the future anyway for the GO train network. This LRT was modeled as having a frequency of six minutes. Unfortunately, it's not really possible to model the effect of such a transit line accurately since many of the current bus lines are designed to send people to Scarborough Town Centre; whereas, in this system, buses would mainly run east-west so as to dump people off at the rapid transit line running through the middle of Scarborough.

Transit times to downtown with an LRT along the train corridor to Pacific Mall.
Data, imagery and map information provided by MapQuest,
Open Street Map and contributors, CC-BY-SA.

Unfortunately, the simulations show only modest benefits from such a system. Near Sheppard, the train corridor is a little too far from Kennedy and Midland to make walking there worthwhile as compared to taking a bus down to the SRT. At a frequency of every six minutes, transfers to the line take too long, and transit users will often find it easier to simply take a bus directly to the Yonge subway. Arguably, the same thing will happen in a revamped SRT if it has a six minute frequency as well though. I think this scenario is too difficult to simulate using my current approach, so it's impossible to compare it with other approaches.

Conclusion

The pressure to build a subway in Scarborough seems to be based primarily on politics than on a genuine need. They feel that they deserve a transit showpiece comparable to what's available in northern and south-western Toronto. This is not a bad thing, but it means that potential transit plans should be evaluated primarily in terms of minimizing construction and operating costs. The cheapest transit line that satisfies the political needs without damaging the existing transit system too much.

For this reason, I actually think the Sheppard LRT and SRT extension project is probably best. Surprisingly, this project will actually result in worse transit times for people living in Scarborough because the increased capacity of the new lines will mean that transit will likely be less frequent than the current system. The project will likely have lower operating costs than the current system, which is the main benefit of building it. The grid spacing of Scarborough also seems to favour east-west transit lines over north-south transit lines because it's often a shorter walk to get to a north-south bus line (which can then take you to an east-west rapid transit line) than to get to an east-west bus line. To save even more money on construction costs, it might even make sense to abandon the Sheppard LRT and only build the SRT extension since the Sheppard LRT doesn't actually do anything interesting from a transit perspective.

Of the two plans that actually involve building real subways--the OneCity plan and the Victora Park extension--I actually prefer the Victoria Park extension. The OneCity plan is more expensive and the expense mostly goes towards shifting an existing transit corridor 2km to the east for no particular benefit. Overall though, I think both plans are bad because they are expensive, unnecessary, and will likely increase operating costs for the TTC. In any case, neither of these plans are feasible because neither the city nor the province have the means to pay for it. The province has run out of money, and my understanding is that they can only find money for new construction through the use of accounting tricks. They are mainly interested in building self-contained new transit lines that can be leased to private industry or mortgaged, thereby allowing the province to hide the cost of the construction by spreading it over many decades instead paying for construction upfront or something complicated like that. Subway extensions aren't self-contained, so they can't be owned and operated independently by private companies, so they don't qualify for accounting trickery, so the province can't fund them.

I'm still partial to my idea of building a "subway" to Pacific Mall. It's a big transit showpiece that can be built at a modest cost, the surrounding area are likely good candidates for densification, and I believe it would be eligible for provincial funding. Unfortunately, it's hard to simulate the effect of such a massive change to transit, so its effects on transit usage is unclear. It might be a little bit too close to the Yonge subway line, so people might still prefer taking a bus to Yonge and riding the subway downtown as opposed to taking a bus to the train corridor, taking a train to Kennedy, taking a train to Bloor, and then taking the Yonge subway line down into downtown. Also, the abandoning of rapid transit to Scarborough Town Centre is likely too politically controversial for politicians.

Tuesday, August 21, 2012

JavaScript is Over: We Just Don't Know What to Replace It With Yet

Although other people (namely Google) have been advocating this for a while, I've only come around to this way of thinking recently, but I think consensus is gradually building in the community that JavaScript is over. In the early days when people were doing simple little scripting with JavaScript, it was a fine language. But as people start writing larger and larger programs with the language, it's showing its weaknesses, and it's becoming clear that JavaScript simply can't handle programming in the large. In the past, there was some talk that software engineering features could be grafted on to JavaScript (the infamous ECMAScript4), but I think researchers are now beginning to realize that the JavaScript model is fundamentally too messy to make such an effort worthwhile. Just as people don't write databases using bash scripts, you just can't have a large million line program developed by hundreds of people written in JavaScript. It's a maintenance nightmare. Even the most radical of the proposed reforms for JavaScript would only improve the situation in minor ways.

The problem is that JavaScript is simply too dynamic. There are static languages, there are dynamic languages, and then there's JavaScript. Once you get to hundreds of thousands of lines of code written by a team of hundreds of people, it's no longer possible for any one person to understand the entire code base. An individual programmer will at best understand the general shape and structure of the code base and understand the details of only the tiny section of the code base that he or she is working on. As such, they are almost entirely dependent on programming tools to help them work with the code. Since they can't understand the interdependencies of the codebase, they need tools to tell them whether changes will likely result in errors or whether a refactoring is safe or whether certain assumptions about how the code is structured is actually true or not. With statically-typed languages like Java, C#, and C++, many tools exist for finding errors in code and performing refactorings and visualizations. Even with older dynamic languages like LISP or Smalltalk, these sorts of tools existed and were quite effective. JavaScript, though, is part of the modern wave of dyanamic languages that have been designed to be flexible and stuffed with features so as to accomodate a wide variety of programming styles. Unfortunately, this flexibility makes building good JavaScript tools extremely difficult. And without these tools, it's not possible to scale up a language for use with large code bases.

The heart of the problem is that JavaScript is too flexible and too dynamic. Duck-typing can be used to change any object from one type to another at runtime. There are no guarantees about which methods can be called when. There are no constraints as to what can happen within functions or not. In past dynamic languages, there have been constraints on the programming paradigms that allowed programmers and their tools to make assumptions about the behaviour of programs. For example, LISP is considered a very dynamic language, but since it's designed around the functional programming model, functions do not have any mutable state, which makes it easier for programmers and tools to understand LISP code. Similarly, Smalltalk, an early dynamic object-oriented language, did not allow methods to change during runtime, an object's data could only be accessed by the object itself, and most method calls occurred through a well-defined mechanism, making code analysis much simpler. JavaScript, by contrast, supports both functional and object-oriented programming, but has none of the accompanying constraints of these older languages. Without these constraints, it's very hard for programmers or their tools to understand how code works. Even simple queries about where a method is used or what methods are available on an object require complicated pointer analysis to solve, if they are even solvable. I think that the key indicator as to whether a programming language can scale to million line programs is whether there are any reliable refactoring tools available for that language. If no such tools exist, then the language is likely too wild for any programmer or their tools to be able to work with a million lines of such code.

In the past, it was thought that JavaScript would be salvagable by adding new features to the language such as modules, sealed objects, types, etc. I now realize that the problem is that these new features are additive. They add new capabilities to the language; they don't impose any additional constraints on how programs are structured. As such, the problem of analysing and understanding million line JavaScript programs becomes harder with these new features, not easier. Types do not help refactoring tools unless types are mandatory. Unless all objects are sealed, it doesn't help a programmer understand whether an object they are using has the correct interface or not at a certain point in the code. These new features do not add new constraints that reduce the search space that a programming tool needs to analyse to understand a piece of code.

Better programming practices might help a bit. But it's cumbersome and error-prone. For example, an extensive set of unit tests can provide similar error-checking as static type-checking. But the effort required to maintain these unit tests is high, and if any area of the code is missing unit tests, then programmers and their tools can no longer rely on unit tests to check their errors.

Flexibility and expressiveness are wonderful features in small languages, but they cause maintenance nightmares when these languages are scaled to giant million line programs. JavaScript is a very expressive and powerful language, but these same characteristics will prevent it from scaling to the sizes that programmers want. Programmers are beginning to realize this. Unless there's some great advance in programmer tooling that can handle the complexity and flexibility of JavaScript, programmers will have no choice but to start using alternate programming languages. These languages will have to compile down to JavaScript so that they can be deployed in web browsers. JavaScript then becomes only a deployment language or an intermediate representation. The real development is done with another language. It's not clear which of these alternate languages will win out, be it C#, Java, Dart, or perhaps some new language in the future, but I think this will be the future trend. It's probably best if new language features aren't added to future versions of JavaScript, but only new engine features or speed improvements. For example, if everyone is coding in alternate languages, then no one will use new syntactic sugar, but features like weak hash maps will greatly augment the types of alternate languages that can be built on top of JavaScript.

Update 2012 Oct 29

So I've been thinking over the problems with JavaScript as a programming language, and these are some things that I think will likely be changed in any JavaScript replacement.

to be removed: Objects as Associative Arrays
to be replaced by: Dictionary Objects, Weak Hash Maps
Associative arrays are extremely useful when programming. But having every object be an associative array is just too much. If you need an associative array, you should create an AssociativeArray object. There is no need to burden every single object with associative array capabilities. Associative arrays are widely used to store supplementary information in the DOM though, so some sort of weak map functionality will also need to be provided to replace this.

to be removed: Prototype Objects
to be replaced by: Class-based Objects
unsure: Methods are Just Functions Inside Objects
If objects are no longer associative arrays, it weakens the need for prototype based inheritance. Class based object systems are much more well-studied and understood than prototype based systems, so moving to such a system should help the building of new tooling significantly. The fact that JavaScript methods are just data members that point to function objects may or may not be a barrier preventing a move to a class-based system though. It might be necessary to have separate notions for function and method.

to be removed: Object Literal Syntax
to be replaced by: Anonymous Classes
JavaScript's object literal syntax is a great, easy way to make small, ad hoc, short-lived objects. But once you move to a class-based system, the current behavior of object literals won't work. They'll probably have to be adapted to some sort of anonymous class system.

unsure: Automatic Casts, Automatic Semi-Colon Insertion, Val Scoping, Property Accessors, Variable Function Parameters
JavaScript has a bunch of other features that have long been troublesome such as the ones above. But I'm not sure if these features actually block the deep analysis of JavaScript code, so it might be possible to keep them if people actually want them.

to be added: Modules, More Declarative Syntax
JavaScript currently doesn't have a good mechanism for breaking code into independent chunks. A good module system should fix that. Continuing with that direction, it would also be useful to provide a more declarative syntax for the language because it would allow code and data structures to exist and be analysed independently of other neighboring code.

Thursday, August 16, 2012

Population Density Map for Toronto by Neighbourhood, 2011

Although the City of Toronto publishes all sorts of weird maps of Toronto, they haven't published a population density map using the most recent 2011 census data yet. They have published all sorts of related maps like population growth rate maps, senior maps, and children maps, but no matter how deeply I dig through their demographics website, I can't find a population density map using the 2011 data. But the City did publish the population of each of the 140 neighbourhoods in Toronto, so I thought I'd just generate my own population density map. I also grabbed Toronto's Open Data map information about the shape of each neighbourhood. So, all I needed to do was to parse the neighbourhood data and output it as svg. The shape data used a universal transverse mercator coordinate system, which means I could just calculate the areas of the shape polygons directly without really worrying about the curvature of the Earth and still get reasonable numbers. And from there, it was easy enough to calculate the population density.

Number of people per square kilometre (Contains public sector Datasets made available under the City of Toronto's Open Data Licence v2.0. Users are forbidden to copy this material and/or redisseminate the data, in an original or modified form, for commercial purposes, without the express permission of Statistics Canada. Information on the availability of the wide range of data from Statistics Canada can be obtained from Statistics Canada's Regional Offices, its World Wide Web site at: www.statcan.ca and its toll-free access number 1-800-263-1136.)

I sketched in the approximate locations of the subway (plus future extensions) so as to give the map some identifiable landmarks. The colour range is linear except for St. James Town, which had such a high population density of over 40k/km^2 that it skewed the colour range too much, so I had to treat it specially.

From looking at the City of Toronto's other maps, it looks like they have population data at a much more fine-grained level available, but I was restricted to individual neighbourhoods, which can be quite big at times and which sometimes encompasses mixes of large apartment blocks as well as low-density housing. Also, the data only shows residential density. The City has employment survey data which shows all the employment centres in the Toronto, but they don't have it in a form that I could process and overlay on top of this map.

The map is sort of limited, so it's not too interesting. I'm not sure if any interesting conclusions can be drawn from it. I guess I was previously curious as to why so much attention was being dedicated to improving transit in Scarborough in the east but not to Etobicoke in the west. But from the map, I can now see that the east end is much bigger with a higher population density, so it probably makes sense to focus efforts there. From the map, the case for a downtown relief line that extends north of Bloor in the east doesn't seem to strong since the density doesn't seem that great as compared to the west side, but the Don Valley could be screwing up the density numbers, and the density numbers don't include information about workplaces.

Tuesday, July 31, 2012

Switch Performance in JavaScript

Recently, I've been running the Mozilla Rhino JavaScript interpreter as a JavaScript program in a web browser, and the performance has been awful. I haven't really dug too deeply into the performance yet, but one thing I've been curious about is the effect of the giant switch statement in the middle of the interpreter.

When writing a bytecode interpreter in C or C++, the switch statement is usually the way to go because you can rely on the compiler to use perfect hashing to make the best lookup tables, and switches in general are very light-weight structures that have performance similar to hand-coding a bunch of goto instructions. Since JavaScript optimizations are done at runtime, it's possible that using a giant switch statement with multiple cases handling the different bytecode instructions might not be optimized correctly.

switch(instruction) { case jump: ... case add: ...}

So, to test things out, I created a bunch of different test cases inside jsperf, and I tested them out. Here are the results:

The larger the number, the better. The bars represent:

1. dense array
2. dense switch
3. giant sparse array with negative values
4. giant switch with negative values and common paths at the end of the switch
5. giant switch with negative values
6. giant all positive sparse array
7. giant all positive sparse switch and common paths at the end of the switch
8. sparse array with negative values (and lots of lookups of negative values)
9. switch with sparse negative values (and lots of lookups of negative values)
10. sparse array
11. sparse switch

It's a little bit worrisome that #4 has worse performance than #5 for all the browsers. The difference between the two is that you have a giant switch where usually only 16 cases of the switch are used. In #5, those 16 cases are at the head of the switch while in #4, those 16 cases come at the end of the switch. If the JavaScript engine were using lookup tables properly, then those two cases should have similar performance. Instead, #4 has worse performance than #5, suggesting that the JavaScript engines are actually encoding the switch statement as a series of if..else blocks, which can be slow if you have a lot of cases (which is definitely the case in a bytecode interpreter).

Chrome seems like it doesn't like it when you do lookups of negative values in arrays. It's possible that Chrome is using strings for negative numbers in arrays since the performance is so poor.

Overall, the numbers seem to suggest to me that if you've got hundreds of cases inside a switch statement, then using a lookup table of functions would probably be a better design though you've got to be careful not to use negative indices.

Wednesday, March 14, 2012

Eglinton + Finch LRT vs. Eglinton Subway

Now that I've figured out how to use the TTC schedule data to build maps of transit times in Toronto, I decided that it might be interesting to use my map generator to analyze some of the proposed transit plans for Toronto. Right now there's a lot of controversy over whether the city should build one long subway under Eglinton or build a shorter subway on Eglinton with the savings going towards an LRT on Finch. By generating different maps for these two different transit plans, I could then see for myself the effect of these plans on transit times.

Methodology

I've already described in detail in previous posts how I generated my maps based on the TTC schedules. Simulating the effect of the new transit plans simply involve adding in new fake schedules for the new transit lines. I could create these fake schedules based on the descriptions of stops, headways, and vehicle speeds given in various TTC and Metrolinx reports (many of which I found by digging through Steve Munro's blog). One issue I purposely avoided dealing with though is the fact that existing bus lines are modified whenever new transit lines are introduced. For example, when a new subway is introduced, bus service in the area is drastically reduced because most people will take the subway and the TTC needs the cost savings from eliminating bus service to pay for the operation of the subway. The bus service that is maintained is usually redesigned to funnel as many people as possible to the subway instead of trying to take people directly to their destinations. I considered the possibility of building up new fake routes and schedules for buses near the new transit lines, but it seemed time-consuming and error-prone. Instead, I'll just rely on the reader to use their judgment when interpreting the maps. The maps only show a broad approximation of what might occur if the new transit lines are introduced.

Baseline

Both of these plans will probably take 10 years to finish, so the Spadina subway extension will be done long before these plans will be implemented. Since the Finch LRT is designed to link up with the Spadina extension, it's important to model the Spadina extension in the experiment.

I went to the TTC website for the Spadina subway extension and figured out the approximate positions of the new subway stations. I wasn't sure if I put the stations in the right spots (i.e. beside existing bus stops so that transfers are easy), so to be safe, I changed the simulation to allow people to walk 200m between stops (unlike the 100m limit that I used previously). The different stations were about 1-1.5km apart from each other. Elsewhere on the Yonge-University-Spadina subway line, the subway seemed to be able to travel about 1.2km in about 2 minutes and 15 seconds, so I used that as the time needed to travel between each station on the new extension. I went through the existing schedules for the YUS-subway. Whenever a subway ended its route or began its route at Downsview station, I extended its schedule with stops along the Spadina extension.

Here is a map of the transit times to downtown including the Spadina subway extension:

Transit times to downtown with the Spadina extension
Data, imagery and map information provided by MapQuest,
Open Street Map and contributors, CC-BY-SA.

This map serves as a baseline that we'll add the different Eglinton transit plans to. Remember that this is a map of the travel times needed to get to Queen or Osgoode stations. It shows an average of the latest times you can leave in order to reach one of those stations before 8:55 and 9:00.

Eglinton Subway

Although I was able to find general information about the Eglinton subway, I have problems finding detailed reports about its route. I was able to find information about all the stops for the Eglinton LRT, so I simply took this list of stops and pruned out those stops that were too close together and hence unlikely to be made into a subway station. The information I could gather from various Metrolinx presentations suggested that the Eglinton subway would go along Eglinton from Black Creek to Kennedy and then follow the route of the Scarborough RT to McCowan. The average speed for route was simulated to be 32 km/h with a 6 minute peak-time headway.

Transit times to downtown with the Spadina extension and the Eglinton subway
Data, imagery and map information provided by MapQuest,
Open Street Map and contributors, CC-BY-SA.

Eglinton and Finch LRTs

I was able to find information about the Eglinton LRT stops and simply fed those into the model. The Eglinton LRT goes along Eglinton from Jane to Kennedy. The stops from Keele to Laird are underground while the rest are below ground. When moving between underground stations, I modeled the LRT as moving at 32 km/h. At other times, I modeled the LRT as moving as 22km/h. I kept the 6 minute headways the same as with the Eglinton subway. There is talk that this plan might include rebuilding the Scarborough RT as an LRT that goes from Kennedy to Scarborough Town Centre and then to Sheppard, but I couldn't really easily find any reports on this, so it might just be talk.

For the Finch LRT, I was able to find a list of stops in the Finch BRT plan. I modeled the Finch LRT as moving at 22km/h with 5 minute headways.

Transit times to downtown with the Spadina extension, Eglinton LRT, and Finch LRT
Data, imagery and map information provided by MapQuest,
Open Street Map and contributors, CC-BY-SA.

Commentary

In the end, it turns out it's really hard to tell the difference between these two transit plans. Part of the problem is that my maps only have a new colour for each 20 minute interval, so commute times have to differ by at least 10 minutes before it's even noticeable on the map. Overall though, neither the subway or LRT plans provide large transformative changes compared to the other.

The Eglinton LRT is essentially the same transit line as the Eglinton subway except that it pops up above-ground for a few small sections. These two small sections are rather short and don't have a major effect on commute times. As such, both plans have largely the same effect in the central portions that are underground or near the underground sections.

Both plans provide a significant benefit to people living on Eglinton in the west. They seem to see good improvements in commute times to downtown. The Eglinton LRT extends a little bit further west than the subway, reaching Jane Street instead of ending at Black Creek. The map seems to suggest that this extra stop gives the LRT an additional benefit over the subway for people living west of Jane Street. I'm not sure if the extra 1.5km of LRT actually makes that big a difference in travel times. It might just be an anomaly with the way the LRT schedule lines up with the local bus schedule in the simulation.

Both plans show some small benefit on Eglinton between Yonge and the Don Valley. That area was actually already well served by buses and express buses, so the subway and LRT don't provide huge improvements in commute time to downtown.

The LRT shows some improvements in commute times for people living on Eglinton between the Don Valley and Victoria Park. The subway shows even more improvement for the people who live in that area.

Once you go east of Victoria Park though, things become a little murky. My maps seem to suggest that east of Victoria Park, it's no longer clearcut as to whether you should go downtown via Eglinton or take a bus down to the Bloor Danforth subway and ride that downtown instead. If a subway is built, it looks like it would be slightly better for commuters in the area to take the Eglinton subway over going down to the Bloor Danforth, especially if bus service ends up cut later on to fund the operation of the Eglinton routes. If the LRT is built, then it's probably better to take the bus down to the Bloor-Danforth than to take the LRT.

Once you actually get to Kennedy station though, I think it really becomes a toss-up as to which route you should take. If Metrolinx was overly conservative on the speeds for the Eglinton subway, and the TTC schedule for the Bloor-Danforth subway is optimistic then maybe it makes sense to take the Eglinton. Otherwise, my maps seems to suggest that the Scarborough RT and Bloor-Danforth subway actually already work pretty well together in providing fast service to Scarborough south of the 401. The main benefit of the Eglinton subway was that you wouldn't have to transfer at Kennedy onto the Bloor-Danforth subway not that it would actually get you downtown significantly faster. If the Eglinton LRT is built, then it is definitely better to take the Bloor-Danforth subway downtown than to travel along Eglinton.

One caveat of my discussion so far is that these maps show commute times to downtown. For people in Scarborough traveling to midtown, then there is a noticeable difference between the Eglinton LRT and Eglinton subway (7 extra minutes). For Scarborough people going downtown though, the Eglinton subway doesn't provide a huge benefit.

The main benefit of the Eglinton LRT over the Eglinton subway though is that it frees up money to spend on an LRT along Finch. The maps show that a Finch LRT provides a small improvement in commute times. Compared to existing service, the Spadina subway extension will actually provide a huge benefit in terms of improving commute times to downtown from the Jane and Finch area. The building of an LRT will only provide some modest improvements over the existing bus lines by about 6-7 minutes or so for people living around highway 400. It's not a huge, transformative gain, but that and the more reliable service that comes from having dedicated transit lanes are nice.

Big Picture

To be honest, I was a little disappointed at how little difference there was between the two plans. I was thinking that there would be huge swings in transit times depending on whether you included the full Eglinton subway or Finch LRT. Instead, all this arguing and fury over the plans are mostly about minor 5 minute differences in commute times at very specific points in the city. The colour differences on my maps are so small that I find it hard to tell the difference between actual differences in commute times and calibration problems with my monitor.

Personally, I think the main thing that all this simulation has shown me is how much Toronto's obsession with subways and trains has poisoned its transit planning. The people of Toronto have literally spent decades arguing over how to find the billions of dollars of funding necessary to build new LRTs and subways. Yet small, modest, and cheap improvements that provide real benefits like buying double-decker and articulated buses or setting aside land for future transit corridors have been completely ignored.

I was actually thinking of doing some further simulations comparing the difference between a Sheppard subway extension to Victoria Park vs. a Sheppard LRT, but my current simulation results suggest that it would be pointless to do so. All the experts and consultants already know what the results of these sorts of simulations are. All that bluster and anger are about whether a billion dollars should be spent to let people at Victoria Park get to downtown 10 minutes faster or whether a billion dollars should be more evenly distributed so that people all along Sheppard can zip down Sheppard 5 minutes faster. In the end, I don't think either plan makes a big difference to anyone.

Eglinton Subway vs. Eglinton LRT: Who cares?

So I've spent a day combing through transit reports in order to gather the data needed to simulate the effects of the Eglinton LRT and Eglinton subway transit plans on commute times to downtown. I generated the necessary maps and am now able to carefully compare their effects.

Conclusion: who cares?

I've had to bring out a microscope to be able tell the difference between these two plans. At maximum zoom, I can detect a subtle difference in shading in a few places. It turns out that neither of these plans are big game changers. Apparently, the laws of physics win out. If you live far away from downtown, then it will take you a long time to commute there. If you live on the edges of the inner suburbs, you may end up saving 10 minutes or so on your one-way commutes. If you live in outer Scarborough, you likely won't see any real difference.

I'll write this up more properly later.

Friday, March 09, 2012

I need to get there by 9ish: Refined Toronto transit time map for the TTC and GO

So I've now refined my program for visualizing the transit times for Toronto. First of all, I've now included GO Transit data in the simulation. As I suspected, it didn't have much of an effect on the results. The distances in Toronto are just to small and GO Transit service is simply too infrequent for it to show up in a simulation of this type. Although taking a GO train might get you downtown faster than riding the bus, because the GO train comes so infrequently, you will likely arrive too early at your destination and end up waiting around. Since this simulation tries to calculate the latest you can leave from a location and still arrive at your destination by 9, the GO train ends up faring poorly. It may be possible that my 100m walk restriction may have resulted in GO train stations being unreachable (since GO Transit is mostly designed for car drivers, most of the stations are surrounded by giant parking lots, and you may have to walk more than 100m from the nearest bus stop to get to the train platform), but I don't really see any evidence of this sort of behaviour on the map, and I think the TTC does try to send buses as close to the platforms as possible. I'm not sure how frequent GO Transit service would have to run before it would show an effect on the map. I'll perhaps have to run a simulation one day to figure that out.

Data, imagery and map information provided by MapQuest,
Open Street Map and contributors, CC-BY-SA.

Another change I made was to discard bus stops from the map that do not have any service between 7am and 9am. These are likely late-night bus stops that don't normally receive service during the day, so it was misleading to include them in the final map of rush hour transit times.

And finally, I no longer simply calculate the best way to get to downtown by 9. I also calculate the best way to get to downtown by 8:55, and I then average the two results. This helps ensure that the data isn't skewed because the schedule shows a perfect transfer of buses (e.g. a bus that only runs every 30 minutes, but just happens to arrive just before another rare bus so that you can get to downtown without waiting--even though in practice, if the timing is just a little bit off, things don't line up and you end up 30 minutes late). By averaging the results for different times, it reduces the magnitude of this effect.

In the end though, the final map visualization looks pretty much the same as the previous one I generated except for the extra GO Transit routes.

Monday, March 05, 2012

I need to be there by 9: Transit times in Toronto aboard the TTC

I recently realized that with all of the open data initiatives pursued by the government, it's possible for amateurs to grab this data and process it themselves. That's, of course, the whole point of open data initiatives, but I hadn't realized until the last few months that that included me.

With my interest in transit, I decided to grab some of the online transit data and see what I could do with it. Since Google has managed to convince everyone to put their transit route and timing information online in the form of GTFS, I decided that a good first step in playing with transit data would be to grab this GTFS feed for the TTC and chart out transit times in the city. The result is below. It shows how early you have to leave from various parts of the city to reach either Queen or Osgoode subway stations before 9am.

Data, imagery and map information provided by MapQuest,
Open Street Map and contributors, CC-BY-SA.

Why downtown during rush hour?
I decided to chart out transit times to downtown during rush hour because this is the most common transit pattern. Most people want to go downtown, and they need to go during rush hour. Even if you do want to go somewhere other than downtown, it's often fastest to still go downtown first because transit systems are usually optimized for taking people to and from downtown the fastest. Of course, the chart isn't representative of off-peak travel times. And sometimes transit systems are overly oriented towards moving people downtown, so it might actually easier to go downtown than to go to the local mall on transit, for example. I was thinking of maybe trying to calculate the transit times to various points in the city, or maybe calculate the transit time needed to reach 70th percentile of the city from any particular location, or maybe the time needed to get 5km away from any location, but doing that would require more complicated coding, and I was feeling lazy, so I opted for the simpler calculation of how long it takes to get downtown.

Simulation Setup
I took the GTFS data for Wednesday, February 15. I took all of the bus stops and transit stations in the data feed and used the transit time information to simulate how long it would take to move between the various stops. In the transit simulation, people are able to walk between stops for a distance of up to 100m at a time at a speed of 4km/h. In reality, some people are willing to much further than 100m if it saves them time, but the 100m restriction prevents the simulation from letting people do things like walk across a highway or across the Don Valley or something. During the simulation, people might walk for a total distance of more than 100m, but they will only walk up to 100m between any two stops. The walk time is simply the time needed to walk the straight line distance between the GPS coordinates of two stops--it doesn't take into consideration buildings in way, terrain, crossing the street, etc.

Weaknesses with the Model
One of the weaknesses of this approach is that it assumes that transit system adheres perfectly to the GTFS data. In reality, buses don't arrive and leave perfectly according to the schedule. People don't try to arrange their bus transfers so that they have exactly 5 seconds to make a transfer. Buses and subways are often late, so people normally would take earlier buses than what the chart shows because they need to ensure there's enough buffer time to make a transfer. There is no sensitivity analysis done on the data--so if I were to run the simulation again but calculate when people would need to leave to get to downtown by 8:55, the calculated transit times might end up being dramatically different. Also, the rule of thumb is that people consider waiting for a transfer to feel like twice as long as actually riding on transit, and that "feel" isn't incorporated into the model either.

Obvious Anomalies in the Map
If you look at the map, you'll notice that there are a lot of purple dots along the Bloor subway line. I suspect that those dots refer to a late-night/early-morning bus that runs when the subway is closed. Due to the 100m walking restriction, those stops can only be "reached" by those overnight buses, but those buses probably don't run during the rush hour since people can just walk a few hundred metres and take the subway instead. I haven't verified this yet.

Previously, I thought that GO Transit doesn't come often enough and that the distances within the city were small enough, that simulating GO Transit wouldn't be useful. But looking at the map now, it looks like that if GO trains come at least twice an hour, then they might provide a benefit for people living at the outer limits of the city, so I might have to incorporate GO data into the simulation later as well.

On the map, you can see the effect of subway lines in the suburbs. Around the subway stations, the transit times are much lower than in the surrounding areas. If you live near a subway station, then you get decent transit times to downtown. Otherwise, you need to take a bus to the station first, which increases your travel time. This effect is particularly prominent on the Sheppard subway and the eastern end of the Bloor subway. Overall, having a subway somewhere in the neighbourhood does seem to improve overall travel times though.

Another annoyance with the way this map is generated is that it's hard to separate the effect of frequency of service from the effect of speed of service. Are poor transit times in some areas caused by infrequent service so people have to wait a long time for transfers or are the times caused by buses moving slowly through congested parts of the city? It's also hard to compare public transit times vs. car times because there's no easy way to calculate how long it takes to drive through the city during rush hour. Oh well, I think it makes a good first step at analyzing some of this public data though.

Monday, February 27, 2012

February 2012 GO Transit Zone Map

GO Transit has a fairly opaque pricing scheme. Although the system is based on fare zones, they never tell you how the fare zones work. When you pay, do you pay according to the number of zones that you pass through? Or do you pay by distance between the zones? If you have a monthly pass between two locations, do you also get free travel on GO within those zones that are between those two locations? What happens if there are two routes between two locations, possibly using different zones? Maybe your fare pays for travel in one central zone plus all nearby zones up to a certain distance? Can you tack on an additional zone onto your travel at lower cost?

Fare zone systems can get pretty complicated, so I understand that, for normal people, GO only provides a little online web page where they can enter a source and destination and have GO spit out a total fare. But for the curious, how does the system actually work? For individual bus stops, GO does tell you their fare zone, but I couldn't find a map of all the fare zones on the GO web page. Apparently, even in the GO Transit bylaws discussing the cost of travel, it simply lists a bunch of destinations and then lists the fares for travelling between different pairs of destinations.

So, in an effort to get some insight into how the GO fare zone system worked, I decided to generate a fare zone map for GO. First, I went to the GO website and downloaded the list of stations, their GPS coordinates, and their fare zones. Then, I put them into the data into a Voronoi diagram generator with the latitude and longitude simply used directly as x and y coordinates. I used Shane O'Sullivan's rewrite of Steven Fortune's original Voronoi code--I needed extra precision though, so I had to use a search and replace to change all the "float"s to "double"s. From there, I took the Voronoi diagram, did a Mercator projection, generated an SVG diagram, overlayed it on an OpenStreetMap map, and I was done.

Data, imagery and map information provided by MapQuest,
Open Street Map and contributors, CC-BY-SA.

My use of Voronoi to generate the diagram obviously has some problems. GO does not provide ferry service, so the zones do not actually extend into Lake Ontario like the diagram shows. There might be zones missing because there are no stops there. Zone 60 ended up being split into two parts because there wasn't a station between Richmond Hill and Vaughan to force the zone to merge together.

Unfortunately, the fare zone map didn't provide me much additional insight into how the fare zone system of GO might work. The zones seem to be irregularly sized, so I think that maybe the GO fare zone system may be more ad hoc with no particular attempt made to let riders buy arbitrary travel in certain zones (i.e. it's optimized for a commuter travel pattern where a commuter always takes the same journey between two locations every day). Oh well.

Saturday, February 04, 2012

Eglinton Subway: the Nuclear Option

Eight billion dollars! Eight billion dollars so that people in Scarborough can ride the subway to Eglinton? Are you kidding me? People want to go downtown. They don't want to go to Eglinton. Why are we adding half an hour to everyone's daily commute? Every single day, the amount of time wasted over all those people will be measured in YEARS. We're going to spend $8 billion dollars to build a vast transportation system for moving people some place that they don't want to go?

This is the worst kind of government waste. It's pure ego-driven excess. Times are tight, yet the politicians somehow think it's ok to waste taxpayer dollars on some pie-in-the-sky dream? This gravy train has got to stop.

Background

I thought the small-time transit politics in Ottawa were bad, but in Toronto, they're even worse. Right now, in Toronto, they're debating whether to spend $8 billion for a subway on Eglinton or $6.5 billion for a subway/light-rail combination. The debates are heated. The politics are nasty. And it's all pointless because both plans are a colossal waste of money.

The sad reality is that the smaller the government, the smaller the politicians. At the federal level, we get our best and brightest politicians. At the provincial level, we get some medium-level politicians. At the local level, no one votes and no one cares, so we end up with what's leftover: the least-experienced small-town politicians in the country. Yet somehow we're entrusting this lowest level of government with the decision on what to do with the biggest consruction mega-project in Canada?

This $8 billion dollars isn't local money either. It's provincial money. From Ottawa to Thunder Bay, Windsor to North Bay, we're all going to be paying higher taxes so that the local politicians in Toronto can strut around like peacocks boasting how they built a subway.

This is provincial money that's being spent, and it's time that the more experienced and more accountable politicians at the provincial level impose some provincial level-headedness on how our money will be spent.

The Case Against Subways

Trains and subways are shiny little toys, and politicians love spending money building them. The problem with trains and subways is that they are almost always a waste of money. They're always over-designed and over-built, and usually the same transit results could have been achieved using much cheaper technology.

Which mode of public transportation is the fastest? The bus.

Which mode of public transportation comes most often? The bus.

Which mode of public transportation is the cheapest to build? The bus.

So why exactly are we spending $8 billion dollars on a subway? Because they're nice? We're spending $8 billion dollars for "nice?" No politician who understands transit would propose building a subway. If you want immediate improvements in transit for the least amount of money, you always go for buses. There are a few limited situations where it makes sense to build subways and light-rail. But 99 times out of 100, if a politician is trying to build something involving trains, then they're wasting your money.

Politicians generally don't use public transit, so their only experiences with public transit are often only from when they're a tourist in foreign country or from the times of their youth. The systems they've used tend to be old and out-dated. Public transit technology has evolved and improved tremendously over the last few decades. You cannot use 1960s transit experiences to guide the building of transit networks in 2010s.

A modern subway is designed to move hundreds of thousands of people per day. This is the size of a medium to large Canadian city. Unless you need to move an entire city's worth of people from one location to another, you do NOT build a subway. Unless the entire population of Scarborough suddenly decided to give up their cars and ride the subway to Eglinton every day, it would not be worthwhile building a modern subway between those two locations. The current ridership on the Eglinton bus is around 50 thousand per day. Notice that 50 thousand is nowhere near the 500 thousand number you want for a subway. The technology for buses and streetcars have advanced to the point where they can be stretched to handle ridership below the hundred thousand range at a fraction of the cost of building a subway.

Many of these subway proponents try to justify a subway by claiming that decades from now, the city will grow to a size where a subway would make sense. This is pure wishful thinking. Toronto already has a backlog of necessary transit improvements that we need done now. But instead of spending money on fixing those problems, we're going to spend money to fix a problem that may or may not happen 50 years in the future? How does that make sense? Do these proponents have some deep insight into the transit needs of Torontonians? The sad fact of the matter is that most subway proponents are car drivers who have never ridden public transit before, and certainly wouldn't use the subway after it has been built either.

The reality is that we live in the age of the car. In the 50s and 60s, Canada was poorer and cars were more expensive, so people rode more transit. The public transit system was actually profitable! Construction was also cheaper back then, so building subways was very affordable. Cities were designed for people riding around in public transit and walking everywhere. But we're richer now, cars are cheaper, and we've redesigned our cities around the car. The city is more spread out, and most people get around by driving. The government does have a responsibility to provide adequate transportation for those who are unable to drive, but the idea that we're going back to days when everyone rides transit everywhere is a pipe dream. If you own a car, and you don't need to go downtown during rush hour, then it never makes sense for you to use public transit. Building extravagant transit mega-projects in the hope of encouraging these car drivers to use transit is just a waste of money. We do need transit, and we should spend money to build a better transit system. But subways are definitely inappropriate for a modern day transit system.

California, you may be surprised to hear, is the birthplace of the anti-tax movement. Despite their reputation for high taxes, Californians were the first to become fed up with their tax dollars being wasted by government, so they held a referendum and banned tax increases. Do you know something else they did in California? They banned subways. They became fed up with their tax dollars being wasted on costly and unnecessary subway projects, so they held a referendum and banned their tax dollars from being spent in subways in LA. A similar policy would definitely be appropriate for Toronto.

The Nuclear Option

With all the bickering in Toronto about whether to build a subway or subway/LRT-combination, or about whether the mayor's decision to build a subway was even legal, now would be an appropriate time for the premier to make use of the nuclear option: he should cancel the subway.

In this time of fiscal austerity, we can ill afford to squander $8 billion dollars on an unneeded subway. For $8 billion dollars, we could run the TTC at no cost to the city for 10 years. Mr. McGuinty knows that the subway is a waste of money, which is why he was so reluctant to fund it in the first place. With a report coming in February on all the massive government cutbacks needed to balance the budget, he has the political cover necessary to kill it. In fact, he doesn't even need to do it himself. He could simply let Metrolinx kill it for him, allowing him to avoid any possible political repercussions

Afterward, he should then impose some restrictions on the funding of new transit projects in Ontario, so that mayors stop wasting taxpayer money on these ridiculous transit plans.

Transit Funding Restrictions

There is a well-known story about a university that built many new buildings with grassy fields between them. The university had to build paved walkways over the grass between the buildings, but it couldn't decide where to put them. Should the walkways go along the sides of the buildings? Or maybe all the paths should meet at a single point in the middle of the field? In the end, the university had a brilliant idea. They planted the grass but didn't build any walkways. Over the next year, students walked over the grass, wearing it down, and making dirt paths through the grass. At the end of the year, the university simply paved over the dirt paths, so that the paths followed the natural movements of the students.

The Ontario government should impose transit funding restrictions to require this sort of thinking on transit development. Instead of expensive grand schemes for redesigning the transit movement of Ontarians, we should have incremental, modest improvements that serve a proven need.

In particular, no funding should be given to light-rail projects unless an existing bus bus transit line with high ridership has proven the need for one. And no funding should be given to new subways unless a light-rail line with high ridership has proven the need for one.

Even with these restrictions, it is inevitable that some unnecessary and poorly conceived projects will be built. To limit the amount of taxpayer damage that result from misguided projects, there should also be a billion dollar limit on individual transit projects. Some may argue that this limit prevents the building of larger, transformative transit projects. I would argue that that is a good thing and that if a city has a good long-term transit plan, then there's no reason a large project can't be broken into smaller incremental parts.

Soothing the Pain

Canceling the subway will make a lot of people angry, so the Premier would have to offer some alternative projects as a compromise. Following the grand government tradition, these alternatives projects can be promised, but never actually built. To keep the Eglinton people happy, a billion dollars of bus tunnel will be built through the most heavily trafficked portions of Eglinton. To please the transit advocates, one of the Queen or King streetcar lines should be buried through parts of downtown. The left will think of it as a down-payment on a downtown relief line while the right will be happy that they eliminated a streetcar. The Scarborough people will get an electrified GO-train line to downtown, possibly with a spur line to Scarborough Town Centre so that it can form a proper transit hub. The Etobicoke people are already getting an airport GO-train, and it should be modified to serve their needs better. As for actual transit users, the people we're actually supposed to be helping, they'll get funding for bus rapid transit, lots and lots of it.

Saturday, January 21, 2012

Egit and Github: Pulling from a Master Repository and Pushing into Your Own Forked Repository

So I'm new to the world of git and Github, so I'm trying to figure out how all this stuff works. Making things more complicated is the fact that I'm using Eclipse and its egit plugin, but most of the documentation on working with Github assumes that you're working with the command-line version of git. Although there is some documentation on how to use egit with Github, it mostly assumes that you created your own repository in git for your own projects. Github encourages people to take an existing project, fork it, and then work with your own copy of the repository though. This then makes things difficult when you need to pull in changes from the original master repository and stick them in your own repository.

Again, there is documentation on how to do get your forked repository in-sync with the master repository, but I'm still trying to feel my way about how to do this in Eclipse with egit. So far, here are some sets of instructions that seem to work with Eclipse Indigo 3.7 (but I'm still feeling my way around though, so I could be wrong).

So I'm assuming that you've forked a repository on Github, and you've successfully used the egit defaults to make a local repository based on your fork. So in the Git Repository Exploring perspective of Eclipse, you go into your repository.

In the Remotes section, right-click and choose "Create Remote"

Set the name to "upstream" and configure it for fetch
the URI should be set to the git URI of the original master repository on Github that you forked
"Add" a RefSpec
the source should be master, and the destination should then be automatically filled in to refs/remotes/upstream/master
Save your changes

In the Remotes section, expand the upstream configuration, right-click the green incoming arrow, and choose Fetch
After fetching, I think the latest changes are now available in your local repository, so now you want to apply those changes to your local branch, and then push them out to your forked repository
So make sure you have your local master checked out or whatever
Right-click your repository and choose Merge...
Merge in from upstream/master
And then you can push everything back into your forked repository on Github with a simple Right-click on your repository and Push.

It looks like egit might also support some sort of special Upstream configuration or something, so that egit will automatically track the master version of a fork, but it looks like the forked repository needs to be configured a special way, and I don't know how to get Github to do this. I imagine if you dig into egit's property files, there'll be some way to enable it, but I can't figure it out yet.

Monday, January 16, 2012

Thinking about Diagrams and Figures with Constraints

I have a small website for teaching kids to program, and I've long wanted to translate it into French and other languages. Being a website geared towards kids, the website has lots of diagrams and figures. Many of these diagrams contain text, so I would need to new versions of about 100 diagrams for each additional language I wanted to support with my website. Obviously, sitting around redrawing the diagrams for each new language would be impractical, especially for languages that I'm personally not familiar with. So I need some system that will let me define my diagrams in a generic way, substitute in different translations for the text, and then generate different versions of the diagrams for different languages.

The crux of the problem is that text might be different sizes in different languages. In English, you might use one word to describe something, but another language might use 5 words to describe the same thing. As a result, the diagrams need to lay themselves out differently depending on the size of the text. I've been trying to find a tool to do that, but I've been having problems figuring out what terms to use in the search. Right now, I'm just trying to understand what sort of tools are available in the space or, given that I probably have to write a chunk of code to substitute in the text translations anyway, how hard it would be to write a simple one from scratch.

Internationalization is a common problem in user interfaces, so coders commonly use a constraint-based layout scheme for their user interfaces in order to avoid this problem. Basically, you have some simple patterns or templates that define how UI elements should be arranged with respect to one another, and then you stick UI widgets into those templates. So you might have a row of widgets, or a grid of widgets, etc. This doesn't necessarily work too well for diagrams because the layouts tend not to be as rigid as in UIs. I was thinking that I could build a more complex version of the system where the diagram designer simply inputs arbitrary layout constraints (e.g. the right side of this element should touch the left side of this other element), and then everything would be fed into a solver of a linear system of equations to get the final layout.

I'm a little concerned though that it might get tedious describing all of the constraints needed to solve a full system. Plus, these constraints might be too rigid in that you can't say, "I need a circle as big as possible to fill that space between the elements" or "make the font size as big as possible as long as the width isn't too big." Of course, doing something like that might require a system that uses linear programming or some other mathematical optimization technique underneath, and it might be tricky to translate constraints from a diagram design into the proper set of inequalities and objective functions without totally confusing the diagram designer.

There are various "constraint-based diagram editors" available, but most of these tools are designed from drawing large graphs (e.g. org charts). Given a bunch of inter-connected diagram elements, these tools will use spring systems or other techniques to find a good planar embedding of the graph. Some of my diagrams could be laid out this way, but I'm not sure if this approach is simultaneously too heavy and too underpowered for the types of diagrams I want. My diagrams don't really fit into those graph models, so I'd be worried that the mismatch of diagram types would make it difficult to get the results I wanted.

I also looked at some constraint solver systems that I could maybe build up into a diagram tool since that would give me the most flexibility in terms of how constraints could be expressed. Unfortunately, constraint solvers are actually a technical term used to refer to specific AI-like problems like SAT, etc. They specialize in solving binary constraints. Although they can also work with real numbers, it seems that some of them solve the problem by discretizing the real number line and then using standard binary constraints to solve things.

Perhaps what I need is some sort of symbolic computation engine that can work with intervals. Does that even exist? Perhaps I'm going overboard with this whole diagramming thing.