Wednesday, February 22, 2012

Today's topic? Topics!

< reorganization >

Last week we at Khan Academy revealed the first pieces of a long-term project to reorganize the content on the homepage and throughout the site. You may have noticed the removal of a number of playlists, notably "Developmental Math" and "Pre-Algebra", and the appearance of newly organized topics called "Algebra" and "Arithmetic and Pre-Algebra". We have grouped the videos under each topic into subtopics as well, to better expose how the thematic building blocks come together and the logical ordering between them. We think showing the structure and consolidating related subtopics under one supertopic is a big improvement over the somewhat haphazard collection of playlists that had evolved over time. Here is what one of the new topics looks like:

The shift in terminology from "playlist" to "topic" is significant: Playlists are completely linear and imply a passive mode of consumption. The concept of a playlist does not include exercises (although the videos in the playlist do have related exercises) and so the natural flow of watching a playlist precludes stopping to do an exercise to practice the knowledge that you've just absorbed.

< topics >

So what is a topic? A topic can represent a single concept or a related group of concepts, and unlike playlists, topics are arranged in a tree. So "Math" is a top-level topic, with "Algebra" beneath it and "Solving linear equations" beneath that. Within these lowest-level topics is our content: not just videos, but eventually exercises and any other tools that we come up with to teach that topic. The ordering is meaningful, so a student can look at the content in a topic and see which videos and exercises teach the concept and the logical order in which to complete them.
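To make the tree idea concrete, here is a minimal sketch in Python. It is only an illustration, not our actual data model: the class names `Topic`, `Video` and `Exercise`, the `walk` helper, and the sample titles are all made up for this example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple, Union


@dataclass
class Video:
    title: str


@dataclass
class Exercise:
    title: str


@dataclass
class Topic:
    title: str
    # Ordered list: subtopics for inner nodes, content for lowest-level topics.
    children: List[Union["Topic", Video, Exercise]] = field(default_factory=list)


def walk(topic: Topic, path: Tuple[str, ...] = ()) -> Iterator[tuple]:
    """Yield (path, item) for every content item, preserving the topic order."""
    here = path + (topic.title,)
    for child in topic.children:
        if isinstance(child, Topic):
            yield from walk(child, here)
        else:
            yield here, child


# Hypothetical sample data mirroring the example in the text.
math = Topic("Math", [
    Topic("Algebra", [
        Topic("Solving linear equations", [
            Video("Hypothetical video: one-step equations"),
            Exercise("Hypothetical exercise: one-step equations"),
            Video("Hypothetical video: two-step equations"),
        ]),
    ]),
])

for path, item in walk(math):
    print(" > ".join(path), "|", type(item).__name__, "|", item.title)
```

Because the children are kept in an ordered list, videos and exercises can be interleaved in the order a student should complete them, which a flat playlist of videos cannot express.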

This has both immediate and long-term effects on how students find and use our content. In the short term, we hope that specific and relevant content will be easier to find on the homepage, and there should be less confusion about which concepts a student is actually learning in each topic. In the longer term, we want to encourage a more active and nonlinear mode of participation: students will have the option to interleave videos and exercises for a more active learning session, and they will be able to tackle specific topics by setting goals and getting more granular feedback on how they're progressing. This will also help us clean up the Knowledge Map so it's easier to read and navigate.

< api >

One important note for users of the public API: if you have been using the API to get a list of playlists for displaying our videos in your application, you will want to migrate to the new topic tree API. The playlists are becoming less coherent as we reorganize content and will eventually be deprecated in favor of the new organization, and all new features in the API will reference topics. We look forward to seeing what navigation methods you find effective for the topic tree.
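If your application is built around flat playlists, one possible migration path is to flatten the tree into playlist-like groups. The sketch below assumes the topic tree has already been fetched and parsed into nested dictionaries. The field names `title`, `kind` and `children`, the helper `leaf_topics`, and the sample data are assumptions for illustration only; check the actual API response for the real shape.

```python
def leaf_topics(node, path=()):
    """Yield (path, videos) for each topic that directly contains videos."""
    here = path + (node["title"],)
    children = node.get("children", [])
    videos = [c for c in children if c.get("kind") == "Video"]
    if videos:
        yield here, videos
    for child in children:
        if child.get("kind") == "Topic":
            yield from leaf_topics(child, here)


# Hypothetical, hand-written sample; not real API output.
sample_tree = {
    "kind": "Topic", "title": "Math", "children": [
        {"kind": "Topic", "title": "Algebra", "children": [
            {"kind": "Topic", "title": "Solving linear equations", "children": [
                {"kind": "Video", "title": "Hypothetical video A"},
                {"kind": "Exercise", "title": "Hypothetical exercise A"},
                {"kind": "Video", "title": "Hypothetical video B"},
            ]},
        ]},
    ],
}

# One playlist-like list of video titles per lowest-level topic.
playlists = {
    " / ".join(path): [v["title"] for v in videos]
    for path, videos in leaf_topics(sample_tree)
}
print(playlists)
```

This keeps an existing playlist-style UI working while leaving room to show the full path of each topic for navigation.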

< ta-da! >

We are really proud of this architectural and pedagogical improvement. We hope to have all the topics under Math organized in the next few months, and to start rolling out exciting new features that leverage this infrastructure soon. Watch this space, and please leave your feedback in the comments!

< / tom >

Friday, February 10, 2012

Caught in a web: Transitioning from native to web development

About four months ago, I joined the development team at Khan Academy. Working at Khan Academy has been amazing: every day I work with talented, motivated and intelligent people to bring innovative solutions to the nascent field of online education. However, that's not what I'm writing about today.

Before I started this job I worked in computer game development, and my programming language was C. Not C++, just plain old C. Now, I have programmed in many languages – assembly, C++, Perl, Java, you name it. But making the transition from five years of writing native code for the PC to full-time web development feels like a radical shift, even though in some ways it shouldn't be. I am setting out to document these differences to serve as a guide to others making the same transition. I suspect a lot of programmers will find themselves in the same situation over the next few years as the complexity (and profitability) of web applications continues to increase.

As a programming language, C is as simple and straightforward as it gets. Compared to C++ or Java there is less structure and no built-in extensive libraries of data structures and utilities. Optimization can be done at a dizzyingly low level, with control over data structure size, method inlining and even the assembly itself if you want it. You end up writing your own dynamic arrays, allocators, and networking stacks. A seasoned programmer can save megabytes of memory by shaving a few bytes from a key structure, or speed up traversal of a tree by several orders of magnitude by making sure adjacent nodes are in the same cache line. Some programmers are so proud of their skills in these arts that they flat-out refuse to work in an interpreted, memory-managed language. Luckily, there will always be some demand for bare-metal optimization in embedded devices, real-time operating systems, etc. However, I've found that in many cases the most straightforward way to solve a performance problem in any language is simply to do less work, and that involves asking tough questions about what data you absolutely need when, whether calculation can be done in the background or deferred or on a remote machine, and how aggressively to cache results. These skills transfer very well to web development, and I haven't yet seen a case where performance suffered and there was simply no remedy. It's just a different trade-off: rather than code being optimized until it's unmaintainable, data is heavily cached, increasing the penalty for code changes if caches must be rebuilt or migrated.
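As a small illustration of "do less work", here is a sketch of result caching using Python's standard `functools.lru_cache`. The function `topic_summary` and the `calls` counter are hypothetical stand-ins for an expensive computation, not code from our site.

```python
from functools import lru_cache

calls = 0  # counts how often the expensive work actually runs


@lru_cache(maxsize=1024)
def topic_summary(topic_id: str) -> str:
    """Hypothetical stand-in for an expensive query or computation."""
    global calls
    calls += 1
    return f"summary for {topic_id}"


topic_summary("algebra")
topic_summary("algebra")      # served from the cache, no new work
topic_summary("arithmetic")
print(calls)                  # 2

# The trade-off: if the code behind the cached value changes, the cache
# has to be thrown away and rebuilt before results reflect the change.
topic_summary.cache_clear()
topic_summary("algebra")
print(calls)                  # 3
```

The second call costs almost nothing, but the `cache_clear()` step shows the penalty mentioned above: every cached result has to be recomputed after an invalidation.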

One great upside of web development is the iteration cycle. A full build of a mature game can take anywhere from minutes to several hours, and most PC games take a minute or two to boot up (longer if they are built in debug mode). This means that the time between making a code change and seeing that change in a running game can be 15 minutes or more. (Anybody who points to MS Visual Studio's Edit-and-Continue feature is invited to try it on a million-optimized-file code base!) Even the most trivial change can take an hour to implement. In web development, the time between write and test is more or less the time it takes to hit Refresh in the browser. This has freed me from hours of compilation and startup time. I can't stress enough how much of a difference this has made for my productivity.

Now, when it comes to debugging, things are more of a mixed bag. The Visual Studio C debugger is very capable and has some really powerful features. I can't count all the times I set a data breakpoint and found someone misusing a variable. On the other hand, I can't count all the times a data breakpoint has helped me find a buffer overrun or someone writing to freed memory, things I never have to worry about now. In the case of JavaScript, each browser has its own debugger, all of which seem to have “borrowed” each other's features and all of which seem roughly equivalent. The availability of eval() in JavaScript is both a blessing and a curse: a blessing because I can do pretty much anything I want to the running code in the console; a curse because browsers don't handle debugging dynamically generated code very well. Then there is Internet Explorer and its own very peculiar bugs. And when it comes to debugging server-side code in PHP or Python, I have not progressed beyond spamming the error log. (If you know of a solution for Python in Google App Engine, please let me know!)

I'll go into more detail in future posts, but to sum up: even though these languages are difficult to get used to if you're used to native C/C++, they do make some things easy and the development tools are constantly improving. While many PC games have updates spaced months or years apart, we are able to deploy code several times a day, sometimes several times an hour. It's a different software development mentality and one that's exciting and invigorating, because it's not about shipping perfect, elegant, bug-free code; what's important is the inspirational product we are delivering for free to students of all ages all over the world.