Saturday, December 31, 2011

Reword your resolutions

I had various degrees of (lack of?) success setting goals. Most of the goals I had were pretty standard and dryly worded: "one tweet per day", "finish reading data viz books", etc. Even though these are things I really want to do, just reading the list was a little draining.

But one goal was different. It was written:
Meet Colbert.
This was in reference to the Hacking Education contest. Entering the contest was really an excuse to practise playing with data and writing about the results in my new data blog. Colbert wasn't even relevant. Even then, those two words made me smile. It excited me, so much that I just had to work on it... Although I didn't end up meeting Colbert, I accomplished what I was really after.

It's so much easier to stick to goals that make you smile, that motivate you, and that capture the desired result rather than the hard work required (even though the hard work is just as important). So if you're setting goals, make them interesting. Make them playful. Make them funny. Figure out what excites you, then use that as a proxy for what you actually want to accomplish (e.g. "meet Colbert" as a proxy for "start a data blog and write stuff in it").

Happy 2012 and if you have some resolutions in mind, good luck!

End of Entry

Tuesday, December 13, 2011

Galois Theory in 1500 Words

[This is a very brief overview of Galois theory. It's not meant to be rigorous, so apologies in advance for leaving out a lot of details and avoiding some delicacies. It was meant to be intuitive and light on math, but it turned out to be neither. If you aren’t familiar with field and group theory, well... proceed at your own risk...]

For a long time, people wondered whether it is possible to write down something like the "quadratic formula" for cubic, quartic and quintic polynomials with integer coefficients. We now know that for cubic and quartic polynomials, this is possible. But for degree 5 polynomials and beyond, it isn't. A proof of this was scribbled down hastily by Galois the night before his duel. Galois linked together field theory and group theory in a beautiful way to answer this very question.

Galois’s Approach: The Big Idea

What does writing down a “formula” for roots of a polynomial really mean? For one, we’d be writing down the roots in terms of rational numbers and a combination of +, -, x, ÷, and radicals (taking n-th roots). This is a very limited set of operations, and certainly not all real numbers can be written this way -- π, for example, can’t be written this way. We say that π is not solvable in radicals.

Are the roots of polynomials with integer coefficients solvable in radicals? Those roots aren’t just any real number, and certainly π is not a root of any polynomial with integer coefficients. Yet Galois showed that there are some degree-5 polynomials with roots that are not solvable in radicals. To see how he did this, we first need some terminology about fields, field extensions, and groups.


The set of rational numbers Q is an example of a field: a set of things you can add, subtract, multiply and divide. We can “extend” Q into bigger fields by adjoining things to it. For example, L=Q(√2) -- pronounced "Q adjoined root two" -- is defined to be the smallest field that contains both Q and √2, and is closed under +, -, x, and ÷. Here the elements of Q(√2) are exactly the numbers that can be written in the form (a+b√2)/(c+d√2), where a, b, c, d are integers and c, d are not both zero. We call such a field L a (field) extension over Q (written L/Q).
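To make "closed under +, -, x, and ÷" concrete, here is a minimal Python sketch. The class name QSqrt2 is made up for this illustration; it stores a number a+b√2 with rational a and b, which is an equivalent way of writing the quotient form above. Division works by multiplying by the conjugate, so the answer never leaves the form a+b√2.

```python
from fractions import Fraction

class QSqrt2:
    """A number a + b*sqrt(2) with rational a, b (hypothetical helper)."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

    def __add__(self, o):
        return QSqrt2(self.a + o.a, self.b + o.b)

    def __sub__(self, o):
        return QSqrt2(self.a - o.a, self.b - o.b)

    def __mul__(self, o):
        # (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2
        return QSqrt2(self.a * o.a + 2 * self.b * o.b,
                      self.a * o.b + self.b * o.a)

    def __truediv__(self, o):
        # 1/(c + d√2) = (c - d√2)/(c² - 2d²); the denominator is a
        # nonzero rational whenever c + d√2 is nonzero.
        norm = o.a * o.a - 2 * o.b * o.b
        return self * QSqrt2(o.a / norm, -o.b / norm)

    def __repr__(self):
        return f"({self.a}) + ({self.b})*sqrt(2)"

x = QSqrt2(1, 1)    # 1 + √2
y = QSqrt2(3, -2)   # 3 - 2√2
q = x / y           # the quotient is again of the form a + b√2
```

Here q works out to 7 + 5√2, and multiplying q by y gives back x.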

When we're adjoining √2 to Q to construct Q(√2), what we're really doing is adjoining to Q a root of the polynomial f(x)=x² -2. We could do this with other higher-degree polynomials. Let p(x) be a polynomial with integer coefficients (and no repeated roots). We define the splitting field K of p(x) to be the smallest field containing both Q and all the roots of p(x). For example, the polynomial p(x)=x²+1 has roots i and -i, so the splitting field K of p(x) is K=Q(i,-i)=Q(i). (The last equality is true because 0 and i are in Q(i), so -i=0-i is also in Q(i).)
Conversely, we call an extension K/Q a Galois extension if it is the splitting field of some polynomial p(x). From before, Q(i)/Q is a Galois extension.

Q-Fixing Automorphisms

An automorphism F of a field K is an isomorphism from K to itself, where the algebraic structure is preserved -- specifically, F(a+b)=F(a)+F(b), F(ab)=F(a)F(b). In the case that K is an extension of Q, we’re more interested in automorphisms of K that have F(x)=x for all x in Q (that is, automorphisms that fix the elements of Q). An automorphism F of K is a Q-fixing automorphism if it has this property.

There are two Q-fixing automorphisms of L=Q(√2): the identity automorphism (call it e) that takes each element of Q(√2) to itself, and an automorphism (call it t) that takes anything from Q to itself, and √2 to -√2. It is possible to show that there is exactly one such automorphism t.
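As a sanity check (not a proof), the sketch below tests on a small grid of sample values that t really does preserve addition and multiplication and fixes Q. The pair (a, b) standing for a+b√2, and the helper names add, mul and t, are my own choices for this illustration.

```python
from fractions import Fraction
from itertools import product

def add(x, y):
    # (a + b√2) + (c + d√2)
    return (x[0] + y[0], x[1] + y[1])

def mul(x, y):
    # (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, a * d + b * c)

def t(x):
    # The non-identity automorphism: a + b√2 -> a - b√2
    return (x[0], -x[1])

samples = [(Fraction(a), Fraction(b))
           for a, b in product(range(-2, 3), repeat=2)]

respects_add = all(t(add(x, y)) == add(t(x), t(y))
                   for x in samples for y in samples)
respects_mul = all(t(mul(x, y)) == mul(t(x), t(y))
                   for x in samples for y in samples)
fixes_Q = all(t((a, Fraction(0))) == (a, Fraction(0)) for a, _ in samples)
is_involution = all(t(t(x)) == x for x in samples)
```

All four checks come out True on these samples. The last one says that applying t twice gives the identity e, which is the whole multiplication table of {e, t}.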


A group is a set with an associative "composition" operation •, an identity element, and an inverse for every element. From above, the set of Q-fixing automorphisms of K, denoted G(K/Q), is a group with • being function composition, and e being the identity element. Observe that we do not require • to be commutative (so a•b may not be the same as b•a in this case).

Two groups are isomorphic if they have the same algebraic structure -- i.e. they’re essentially the same group except the elements have different names. It is useful to see whether G(K/Q) is isomorphic to a group that we know and understand. Two important classes of groups that are well understood are cyclic groups and permutation groups.

Examples of Groups

The cyclic group C(n) of order n is the set {0,1,2,3,..,n-1}, with • being addition mod n. From the example of L=Q(√2), L has G(L/Q)={e,t} with • being function composition. This is actually isomorphic to C(2)={0, 1}.

Permutation groups consist of functions that permute some "letters". We'll use S(n) to denote the group of permutations of n letters. For example, S(3) is all the permutations of letters {a, b, c}. A function F that takes a→b, b→a, c→c is one such permutation (and hence an element of S(3)).

The Fundamental Theorem of Galois Theory

Galois noticed that for a Galois extension K/Q, there is a link between "subfields" of K containing Q, and "subgroups" of  G=G(K/Q). The quoted words mean exactly what you might think -- a subfield of K is a field L that is contained in K (and is closed under +, -, x, ÷). For example, Q is a subfield of R, and Q is a subfield of Q(√2). A subgroup of G is a group H that is contained in G (that is closed under • and contains the identity element).  For example, the subset {0,2,4} is a subgroup of C(6) = {0,1,2,3,4,5}.
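The subgroup conditions are easy to check by brute force for a cyclic group. The function below is a small sketch (its name is made up for this post) that tests a subset of C(n) for the identity, closure under addition mod n, and inverses.

```python
def is_subgroup_of_cyclic(subset, n):
    """Check that `subset` is a subgroup of C(n) under addition mod n."""
    s = set(subset)
    has_identity = 0 in s
    closed = all((a + b) % n in s for a in s for b in s)
    has_inverses = all((-a) % n in s for a in s)
    return has_identity and closed and has_inverses

evens = is_subgroup_of_cyclic({0, 2, 4}, 6)
not_closed = is_subgroup_of_cyclic({0, 1, 2}, 6)
```

The first call returns True. The second returns False, because 1+2=3 is not in the subset.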

More concretely, there is a 1-1 correspondence between subfields of K containing Q and subgroups of G:
  • For a subfield L of K containing Q, there is a subgroup H of G corresponding exactly to the automorphisms that fix all of L -- i.e. F(x)=x for all x in L, not just Q.
  • The reverse is true as well: if H is a subgroup of G, then there is some subfield L of K containing Q that is fixed by all of H.
In particular, whenever L/Q is itself a Galois extension, H is a normal subgroup of G. Think of H being normal as H being well-behaved enough that the quotient G/H is another group. We won't get into what this means, but this group G/H turns out to be isomorphic to G(L/Q).

For a concrete example, let’s take K=Q(ω, ∛2), the splitting field of x³-2, where ω=(-1+√(-3))/2 is a primitive cube root of unity. It’s possible to show that K/Q is a Galois extension, and that G=G(K/Q) is isomorphic to S(3). In particular, the subfields of K include Q(ω) and Q(∛2), and they correspond to subgroups of S(3) isomorphic to C(3) and S(2). Like Q(i) before, Q(ω) is a Galois extension of Q (it is the splitting field of x²+x+1), and C(3) happens to be normal in S(3).

Linking Back to the Big Idea

Suppose p(x) is a degree 5 polynomial, and that it has a root x that is solvable in radicals. Then really, x is in some field K containing Q, where K can be "built up" from Q by successively adjoining (n√α), the n-th root of α, for some n and α. For example, take  x=√(2+ √5). We set L=Q(√5) and K=L(√(2+√5)) to "build up" K in this manner.
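As a quick check that this x really is a root of a polynomial with integer coefficients, we can undo the radicals one at a time. (The polynomial we end up with is a quartic rather than a quintic, but the "building up" works the same way.)

```latex
\begin{aligned}
x &= \sqrt{2+\sqrt{5}} \\
x^2 - 2 &= \sqrt{5} \\
(x^2 - 2)^2 &= 5 \\
x^4 - 4x^2 - 1 &= 0
\end{aligned}
```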

Solvable Fields

In general, we call an extension K/Q solvable if K=K0⊇K1⊇K2⊇...⊇Q, where each K(i-1)=Ki(n√α) for some n and α. This is exactly the construction we had a paragraph ago. As another example, K=Q(i, ∛2) is a solvable extension since Q(i, ∛2)=Q(i)(∛2)⊇Q(i)⊇Q is in the desired form (recall i=√(-1)).

For a polynomial p(x) with coefficients in Q, the splitting field K of p(x) is the smallest field containing all the roots of p(x), so the roots of p(x) are solvable in radicals if and only if K is solvable.

Solvable Groups

We can actually assume (ignoring some subtleties) that each K(i-1)/Ki from above is a Galois extension. In this case each G(K(i-1)/Ki) is actually isomorphic to C(n) for some n.

Thus in order for a field extension K/Q to be solvable, G=G(K/Q) must be in a particular form: there has to be a chain of subgroups,  G=G0⊇G1⊇G2⊇...⊇{e}, where each Gi is a normal subgroup of G(i-1) and G(i-1)/Gi=C(n) for some n, and {e} is the trivial group with just the identity element. For K=Q(ω, ∛2), the splitting field of x³-2 (with ω a primitive cube root of unity), the chain of subgroups looks like G(K/Q)=S(3)⊇C(3)⊇{e}.

A Quintic Formula Cannot Exist

To recap, roots of p(x) being solvable in radicals requires the splitting field K of p(x) to be a solvable field, which in turn requires G(K/Q) to be a solvable group.

But with a little group theory, we can show that S(5) is not solvable. Further, any irreducible quintic polynomial with exactly two non-real roots has Galois group S(5). These last facts require some more concepts to develop, but in any case -- not all roots of quintic polynomials are solvable in radicals.
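If you'd rather let a computer do the "little group theory", here is a brute-force sketch. It uses a different but standard test from the chain described above: for a finite group, keep replacing the group by the subgroup generated by its commutators a•b•a⁻¹•b⁻¹ (the derived series); the group is solvable exactly when this eventually reaches {e}. All function names are made up for this post, and permutations are stored as tuples of images.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(gens, identity):
    """Close the generators under composition (enough in a finite group)."""
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def derived_subgroup(group, identity):
    """Subgroup generated by all commutators a b a^-1 b^-1."""
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in group for b in group
    }
    return generated_subgroup(commutators, identity)

def derived_series_orders(n):
    """Orders of the groups in the derived series of S(n)."""
    identity = tuple(range(n))
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        nxt = derived_subgroup(group, identity)
        if len(nxt) == len(group):   # the series has stopped shrinking
            return orders
        orders.append(len(nxt))
        group = nxt

s3 = derived_series_orders(3)
s5 = derived_series_orders(5)
```

For S(3) the orders are 6, 3, 1, so the series reaches {e}. For S(5) the orders are 120, 60 and then it gets stuck at a group of order 60, so S(5) is not solvable.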

End of Entry

Monday, December 5, 2011

Visualizing 4+ Dimensions

When people realize that I study pure math, they often ask about how to visualize four or more dimensions. I guess it's a natural question to ask, since mathematicians often have to deal with very high (and sometimes infinite) dimensional objects. Yet people in pure math never really have this problem.

Pure mathematicians might like you to think that they're just that much smarter. But frankly, I've never had to visualize anything high-dimensional in my pure math classes. Working things out algebraically is much nicer, and using a lower-dimensional object as an example or source of intuition usually works out -- at least at the undergrad level.

But that's not a really satisfying answer, for two reasons. One is that it is possible to visualize high-dimensional objects, and people have developed many ways of doing so. Dimension Math has on its website a neat series of videos for visualizing high-dimensional geometric objects using stereographic projection. The other reason is that while pure mathematicians do not have a need for visualizing high-dimensions, statisticians do. Methods of visualizing high dimensional data can give useful insights when analyzing data.

"But there aren't any more direction to draw the fourth axis..."

A professor would draw the usual x, y and z-axis on the 2D chalkboard, then announce that there aren't any more directions remaining to draw another axis until someone invents a 3D chalkboard. Then I would die a little.

Of course there are! How did we pick the direction to draw the z-axis on a 2D surface? We picked a direction not parallel to the x- or y-axis, kind of at random (because any such direction would work). In doing so, we decided on a projection of the 3D space onto the 2D surface, like this:

There are some ambiguities as to which points in the 3D space these black points actually represent. In fact, each point above has an entire line projecting to it. Sometimes, authors would disambiguate points by dropping a line parallel to the z-axis down to the x-y plane, like this.

So, as we said before, we add a 4th dimension by drawing a t-axis not parallel to any existing axis. Here each point has an entire plane projecting to it, so we drop down another line parallel to the t-axis to disambiguate. We get this.

So voila, we get a projection of 4D space onto the 2D surface of your screen. The following image from Wikipedia showing hypercubes of dimensions 1-4 uses the same method as above:

Technically speaking, this method can work for any number of dimensions, but things get messy and confusing very quickly. Even in the 4D plot above it's challenging to wrap your head around what's happening -- dropping verticals parallel to all the other axes might help a bit, but even in 5D this would get pretty messy. Perhaps that's why the images of the hypercube stopped at dimension 4.
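The "pick a direction for each axis" recipe is just a linear projection, so it's easy to sketch in code. The 2D directions below are arbitrary choices of mine (any four pairwise non-parallel directions would do), and the code only computes coordinates; feed them to whatever plotting tool you like.

```python
from itertools import product

# Hypothetical directions on the page for each of the four axes.
AXES_2D = {
    "x": (1.0, 0.0),
    "y": (0.0, 1.0),
    "z": (-0.5, -0.5),
    "t": (0.7, 0.3),
}

def project(point):
    """Project (x, y, z, t) onto the page as a sum of scaled axis directions."""
    directions = list(AXES_2D.values())
    px = sum(c * d[0] for c, d in zip(point, directions))
    py = sum(c * d[1] for c, d in zip(point, directions))
    return (px, py)

# The 16 vertices of the unit hypercube, and the 32 edges joining
# vertices that differ in exactly one coordinate.
vertices = list(product((0, 1), repeat=4))
edges = [(u, v)
         for i, u in enumerate(vertices)
         for v in vertices[i + 1:]
         if sum(a != b for a, b in zip(u, v)) == 1]

projected = {v: project(v) for v in vertices}
```

Drawing a line segment between projected[u] and projected[v] for each edge (u, v) gives a picture of the 4D hypercube in the same style as described above. Adding a fifth entry to AXES_2D (and a fifth coordinate) is all it takes to go up a dimension, which is also how quickly it gets messy.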

So, are there nicer ways of visualizing even higher dimensional data?

Parallel Coordinates, star plots, etc

Yes, there are many, and they are often used in statistics. One simple technique is to map certain dimensions to other features of each "point" -- its shape, colour, size, etc. This is so often done that you probably wouldn't even think of it as a visualization technique, but it works for visualizing a few more dimensions. There are other techniques, too:

Radial plots have all the axes in a circle, like below, with each "point" or observation drawn as a polygon with vertices determined by its value along each axis. Sometimes it is better to make separate plots for each "point", since overlapping lines can make the chart messy. Again this works well for visualizing a handful of dimensions, and a small number of "points" or observations.

Parallel coordinate plots are like unwrappings of the above. A "point" becomes a series of line segments connecting its values along each dimension. This chart is considerably less messy, so a larger number of "points" can be plotted on a single graph.
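Here is a small sketch of what "a point becomes a series of line segments" means in practice. The function name and the sample data are made up; rescaling each dimension to [0, 1] is one common choice so that the axes are comparable, not the only one.

```python
def parallel_coordinates(rows):
    """Turn each observation into a polyline of (axis_index, height) vertices.

    Each dimension is rescaled to [0, 1]; a constant dimension is
    drawn at the middle of its axis.
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    polylines = []
    for row in rows:
        line = []
        for i, value in enumerate(row):
            span = highs[i] - lows[i]
            height = (value - lows[i]) / span if span else 0.5
            line.append((i, height))
        polylines.append(line)
    return polylines

# Three made-up 4-dimensional observations.
data = [
    (1.0, 10.0, 5.0, 0.2),
    (2.0, 30.0, 5.0, 0.4),
    (3.0, 20.0, 5.0, 0.8),
]
lines = parallel_coordinates(data)
```

Each entry of lines is one observation's polyline: the vertex (i, h) sits at height h on the i-th vertical axis, and consecutive vertices are joined by straight segments.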

Then there are goofier things like Chernoff's faces, which map dimensions to features of faces. The idea is that since we are biologically hardwired to tell apart faces, we'd be able to easily tell which data points are similar to each other and which ones are different.

All of these plots make it relatively easy to find clusters within the data. However, it is difficult to find geometric properties: Can you imagine what points on a tetrahedron in 3D would look like here? A sphere? Thus from a mathematical standpoint, these graphics do not preserve anything that is usually of geometrical importance (angles, lengths, etc) -- stereographic projection fares better, but algebraic techniques are often quite powerful.

Visualizing High Dimensional Data is a Statistician's Problem, not a Mathematician's

To conclude, it is possible to visualize high dimensional objects. However, from the point of view of a pure mathematician, such visualizations are usually less helpful compared to algebraic techniques and intuitions on how low-dimensional objects behave. Thus the problem of visualizing high dimensions is a statistician's problem. Statisticians have much more to gain from visualizing their high-dimensional data.

So if you ever want to ask this question to a pure mathematician, ask a statistician instead. They'll be able to give you a better answer.

[PS: As @chlalanne pointed out, statisticians have built pretty good tools for visualizing high dimensional data. GGobi is a pretty powerful one. It interfaces with R through the rggobi package.]

End of Entry

Tuesday, November 29, 2011

It turns out

So it turns out that Douglas Adams had this to say,
Incidentally, am I alone in finding the expression ‘it turns out’ to be incredibly useful? It allows you to make swift, succinct, and authoritative connections between otherwise randomly unconnected statements without the trouble of explaining what your source or authority actually is. It’s great. It’s hugely better than its predecessors ‘I read somewhere that...’ or the craven ‘they say that...’ because it suggests not only that whatever flimsy bit of urban mythology you are passing on is actually based on brand new, ground breaking research, but that it’s research in which you yourself were intimately involved. But again, with no actual authority anywhere in sight.
I'm convinced that every professor knows about this, and chuckles to themselves every time they use this phrase.

End of Entry

Sunday, November 13, 2011

Bubble Sort

Bubble sort is the crappiest sorting algorithm out there. For the longest time I found it perplexing why it was taught in school, when insertion sort and selection sort are more intuitive, (slightly) more efficient, and easier to implement. Researchers mentioned in the Wikipedia page find it mind-boggling as well.

Then bubble sort came up in the most unlikely circumstances, in Galois theory, when proving that the roots of the polynomial x^5+15x+5 cannot be written using +, -, *, / and radicals (taking the n-th root). The key idea is that one can sort a list when equipped with two operations: rotating the entire list, and flipping two elements (at fixed indices).

This means that one can also obtain any permutation of a list using the same two operations. The fact that this can be done isn't difficult to show, but it isn't immediately obvious either (at least to me).

So the takeaway is this: bubble sort tells us that it is possible to permute a list by means of rotation and flipping two elements.
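Here is one way to see it in code. This is a sketch of my own, assuming "rotating" means a left rotation by one position and "flipping" means swapping the elements at the fixed indices 0 and 1. The list is only ever modified through those two operations; each pass is an ordinary bubble sort pass, with rotations used to bring the next adjacent pair to the front.

```python
def rotate(lst):
    """Rotate the whole list left by one position, in place."""
    lst.append(lst.pop(0))

def flip(lst):
    """Swap the elements at the two fixed indices 0 and 1, in place."""
    lst[0], lst[1] = lst[1], lst[0]

def rotate_flip_sort(lst):
    """Bubble sort that moves elements only via rotate() and flip().

    Returns the list of operations used, in order.
    """
    n = len(lst)
    ops = []
    if n < 2:
        return ops
    for _ in range(n - 1):        # n-1 bubble passes
        for _ in range(n - 1):    # visit each adjacent pair once
            if lst[0] > lst[1]:
                flip(lst)
                ops.append("flip")
            rotate(lst)
            ops.append("rotate")
        rotate(lst)               # the n-th rotation restores the alignment
        ops.append("rotate")
    return ops

data = [3, 1, 4, 1, 5, 9, 2, 6]
operations = rotate_flip_sort(data)
```

After the call, data is sorted, and operations records a sequence of rotations and flips that does the job. Running that sequence backwards (undoing each step) takes the sorted list back to the original order, which is why the same two operations can produce any permutation.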

End of Entry

Thursday, October 6, 2011

Being a girl in tech

I wrote up several versions of this post over the last couple of years, but never really had the guts to hit "Publish".

The reason I was reluctant is that the world had been pretty kind to me. I'm a little ashamed to be bothered by experiences that pale in comparison to AnnMaria's and have few similarities even to Tracy's. Their posts should be read before mine, and their words should be taken more seriously than mine.

My school has a "co-op" system. We spend every other four months interning at companies. Starting from the third week of class every term, those lucky enough start their interviews. At the end of week 6 or 7, we get to choose where to go based on which companies gave us offers.

A good chunk of jobs are tech positions. At a time when every company is trying to boast about the number of female engineers they have on board, the few girls who apply for tech jobs seem to have a much easier time, getting positions that seem too good for their qualifications. This phenomenon was pretty widely talked about (at least around me), but unfortunately I don't have data to back up the anecdotes.

A stranger story I heard was about a company that told a male candidate that it had "several qualified female candidates" lined up for next term, in hopes that this would convince him to take the internship offer.

Another company (unrelated to my school) was hiring their first female engineer. They decided that they really needed her to make the female:male ratio nonzero, and that they would continue raising her offer if she tried to take a position somewhere else.

These are all just stories, and some of them might not be bad per se. (Heck, even Google had a rule about how a certain percentage of their engineers had to be female!) But they did always make me question the internships I got and the reasons I got them. How much did these companies want me for my skills, experience and potential, and how much was it about increasing "the ratio"? Worse yet, how many people who were more qualified than me were rejected because of me?

But things turned out well for me, didn't they? Tracy pointed out that one thing that we can do to help women in technology is provide mentorship. I got that. I had wonderful mentors in all the companies that I interned at, and they helped me make real contributions while growing as a person. All the people who made hiring decisions were great people, who I'm sure tried to make the best decisions they could for their companies. So maybe this was a good thing.

Maybe this is the reason why my experiences differ from Tracy's.

I don't know.

The ethics of affirmative action are pretty difficult to ponder. All I can say is that it's not perfect. I've had my fair share of impostor syndrome, which it probably amplified.

Perhaps it's not too bad if it makes things easier for the next generation.

End of Entry

Wednesday, September 14, 2011

Engineering vs Problem Solving

A lot of people say that I'm not really a programmer. There are other things that are interesting to me, like data science, data visualization, and math. Maybe you're like me too: maybe you're comfortable writing whatever code to solve whatever problems you're having, whether it involves web programming, JavaScript, Python, or just a single line of awk, but you don't have a "proper" background in computer science.

But recently I realized that this wasn't what they meant. What people really meant to say was that I am not an engineer. An engineer is someone who builds things, and building software quickly often means churning out a lot of code. There are many friends of mine who can do this very well, and it's very useful. Sure I can build things too, but not at their speed.

I'm a different kind of programmer. I'm a problem solver. I like to look at a nontrivial problem in software, and come up with ways to solve it with a minimal amount of code change. If you're looking at a really hard problem that people around you would write a lot of code to solve, and you wake up one morning with a much easier solution, then perhaps this is your style too.

A lot of programmers I know fall into one "type" or another. The two aren't really mutually exclusive (especially since good engineering is supposed to involve good abstraction and thus less code). It's probably also determined by people's preference about the kind of things they like to do, how they approach problems, and what they find more rewarding. Building new things is really fun and writing lots of code makes you feel productive, while doing problem solving makes you feel clever when you count the lines of code you didn't write.

I happened to find the latter more rewarding. So, I'm a programmer who counts the lines of code that I didn't have to write.

End of Entry

Sunday, September 11, 2011

Summer 2011: personal reflection

To say that I had a wonderful time last term in San Francisco would be an understatement. I haven't been happier in all my life. Granted I haven't lived for a long time, but nevertheless it was a good term. The reasons it was good are pretty straightforward:

  • interesting problems to work on at work -- some things were like puzzles, where the solution came to me on a Saturday morning as I woke up,
  • side projects -- launching DataInColour was on the agenda for about half a year, and doing so while supporting DonorsChoose by hacking with the data was a great late evening/weekend project to be doing,
  • meeting people doing what I love -- going to data insight sf meant I finally had a chance to work with David and Fravic, and led me to get back in touch with the person who originally taught me ggplot2 and to see what's up at the Square office,
  • generally being active -- walking about 40-50 min per day, doing some hiking on the weekends, and learning how to swim probably helped in increasing happiness,
  • plenty of down time -- I slept a lot more than usual this term, and lazed around more than a certain friend of mine would be comfortable with,
  • and of course, being surrounded by great people -- no explanations needed here.
Trying to reproduce some of these in Waterloo would be difficult, but hopefully there will be enough going on at school to make this term just as fulfilling.

End of Entry

Wednesday, July 13, 2011

The Creative Process

This happens with about anything that I do that lasts more than a few days. The funny thing is that every time I press that "publish" button, something good happens. I honestly don't think that any of the work I published here, over at the Facebook data blog or at DataInColour is particularly great, but for some reason, posts that I am more hesitant to publish end up being the better ones.

(And yes, while working on the "Hacking Educations" series I did accidentally run "rm -rf *" while attempting to move code into git. It might have ended up being a good thing. People say that knowing what to remove is half of writing well, and I try to do that in my data posts: i.e. remove "random facts" that just aren't part of the story. In the "Hacking Educations" series and (less so) in the "Mining Jobmine" series, there were a lot of "random facts" that didn't really fit anywhere. Losing half of my code made it easier to remove those.)

PS: Sorry about the bad art and terrible "scanning".

End of Entry

Thursday, June 23, 2011

New Data Blog: Data in Colour

I've been waiting for a while to announce my new data blog, but here it is now.

All my data projects will now be posted there. Non-statsy posts will appear here as per usual.

End of Entry

Sunday, June 19, 2011

Chinese and Entropy

I happened to have a cold that day, and was trying to write a brief note to a friend canceling an appointment that day. I found that I couldn't remember how to write the character 嚔, as in da penti 打喷嚔 "to sneeze". I asked my three friends how to write the character, and to my surprise, all three of them simply shrugged in sheepish embarrassment. Not one of them could correctly produce the character. Now, Peking University is usually considered the "Harvard of China". Can you imagine three Ph.D. students in English at Harvard forgetting how to write the English word "sneeze"?? Yet this state of affairs is by no means uncommon in China. -- David Moser

I came across David Moser's essay on why Chinese is hard a while ago, and found it quite entertaining. Moser points out that Chinese is not only difficult for non-native speakers, it is also difficult for native-born Chinese as well. Among the reasons Moser thinks Chinese is hard are that the language is not phonetic, it has no alphabet, and it has a god-awful dictionary system.

All of these reasons are, to some extent, consequences of the fact that Chinese is much denser than English. I would hypothesize a Chinese text to have a higher entropy than its English equivalent (although this is probably pretty difficult to measure). Even in speech, one can convey the same information in fewer syllables in Chinese than in English.
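For what it's worth, the crudest possible measurement is the Shannon entropy of the character frequencies, H = -Σ p(c) log2 p(c), in bits per character. The sketch below (the function name is mine) computes just that. It ignores context entirely, treating each character as independent of its neighbours, which is a big part of why a fair comparison between languages is hard, so take it as an illustration of the quantity rather than a test of the hypothesis.

```python
from collections import Counter
from math import log2

def char_entropy(text):
    """Empirical Shannon entropy of the character distribution, in bits per character."""
    counts = Counter(text)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

uniform_two = char_entropy("aabb")    # two equally likely symbols
uniform_four = char_entropy("abcd")   # four equally likely symbols
constant = char_entropy("aaaa")       # no uncertainty at all
```

Two equally likely symbols give 1 bit per character, four give 2 bits, and a text made of a single repeated character gives 0. A writing system with thousands of distinct characters has room for a much higher value than a 26-letter alphabet.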

I think one reason why Chinese is so terse is that historically, Chinese people commended terseness. Terseness is associated with wisdom. One can imagine followers of an old master asking a long list of questions, only to receive a single, one-syllable word in response.

End of Entry

Friday, April 15, 2011

Education Inflation, Technological Advancement, and Cognitive Surplus

The question of whether or not to pursue higher education is a relevant one among my almost-graduating peers. Especially with all the attention this issue has received recently, I can't help but engage in a bit of amateur speculation myself. Most of the articles on this issue study the nature of higher education directly. I don't wish to do that. Instead, I'd like to speculate about why it makes sense economically for there to be education inflation, and why education inflation might have an effect analogous to producing a generation of sitcom fans.

Education Inflation

The amount of education one needs to obtain a particular job has increased over the years. Positions that used to require high school level education now require college/university education. One can say that technological advances demand more skilled individuals in the workforce, but that's not the whole story: Asian countries such as China aren't exactly more technologically sophisticated than countries in North America, but in China one would need to have a PhD, have decent grades, and have spent at least a year studying abroad in order to even apply for an entry position at a bank.

So education inflation is real, and overpopulated Asian countries are having it much worse. This makes sense considering the natural explanation for education inflation that I always hear: as population increases, the number of candidates that apply for a given job increases, forcing employers to raise the bar on what is considered to be a "qualified" candidate. Admitting only candidates that have achieved a certain level of education is a quick and easy way of reducing the resume pile, albeit a somewhat arbitrary one.

This is a nice explanation, but I think we can dig deeper: why is the number of qualified candidates per job increasing?

Technological Advancement

Back in the day of subsistence farming, every person in a society needed to contribute his or her efforts in order for the group to survive. With technological advancement, one person's effort is now sufficient to feed many other people. The advancement allowed us to move from an agriculture-based economy to a service-based economy, meaning that instead of 50 people working on food production, 10 people can take care of food production while the 40 others find other ways to enrich each other's lives.

This is supposed to be a good thing. Technological advancement gives us, as a society, free time to do whatever we wish: build better shelter, design better clothing, invent the internet, speculate about the meaning of life, or even just take a break! Unfortunately, we have a different name for prolonged free time: unemployment.

We produce enough food to feed everyone, yet in order to live well and be respected, you need a job. In order to obtain a job, you need to satisfy an employer's stringent and sometimes arbitrary requirements. In order to do so, you must spend extra years of your life in school. Thus, as a society, instead of using the free time we gain from technological advancement to our advantage, we end up forcing people to spend this time in school pursuing education that they might not necessarily enjoy.

Cognitive Surplus

I first saw the idea that "free time" might be an important aspect of society from an article by Clay Shirky. Shirky hypothesized that as we initially introduced the five-day work week and 8-hour work days, we didn't know what to do with all the extra free time. As a consequence, we spent the extra time watching sitcoms. "The interesting thing about a surplus like that is that society doesn't know what to do with it at first" – he says. Shirky's article is about how sites like Wikipedia are finally taking advantage of the cognitive surplus, a resource that can do so much for us.

Back to the original question: when we don't need every single person in society to work to supply everyone's basic needs, what should we do with the extra time available? This is by no means an easy question to answer. I think that in some cases, pursuing higher education now is like what watching sitcoms was back in the day. It is society's first answer to the question, perhaps a decent first-order approximation but definitely not the best answer.

To be fair, the question of what to do with extra resources isn't a new question. We've had to answer this question many times over as our technology became better and better. We started by moving from an agricultural economy to an industrial economy to a service economy. I'd like to ask an economic historian whether the move was as smooth as we'd like to think it is.

Now what?

"Now what?" is the final thing the fishes say in the ending scene of "Finding Nemo" when they've finally escaped from the aquarium into the ocean. It's never an easy question to answer, yet it is one that many of us will need to answer in the next year or so. In this economy, higher education is the default answer, much like watching television was back in the 50's. After all, when we ask a senior student what they plan on doing after graduation, we expect to hear either the name of a prospective employer or that of a university.

Defaulting to a job or a masters program gives us a certain social standing along with a support structure and an illusion of certainty (some money or a degree at the end). It's certainly not the only choice, though. Those of us who can live off of savings or parents for the next few years have a tremendous amount of freedom to do whatever we want. Freedom in excess is terribly scary and has inherent risks, but if we can accept the uncertainty, we can work through it. We've done it before: we eventually watched fewer sitcoms and created Wikipedia. We can do something better here, too.

End of Entry

Friday, April 1, 2011

Canadian J1 visa data: sanity check failed

Being sick was an excuse for me to play around with R and this dataset from If you're a Waterloo student and you did an internship in the US, you might find that the plot below doesn't quite add up...

If I recall correctly, both in 2008 and 2009 the number of people attending the US Visa info session was well over 100. In fact, the summer 2009 contact sheet for Waterloo students interning in California alone contains 41 names -- that's NOT even including students doing their co-ops in other places in the US and in the fall/winter terms.

Alas, I wonder how trustworthy other numbers listed in this and other datasets are. It's too bad that we can't sanity check other datasets in a similar, direct fashion.

End of Entry

Monday, March 28, 2011

Emails from UW Presidents

End of Entry

Monday, February 14, 2011

"R Graph Cookbook" - a review

A publisher asked me to write a review for the book "R Graph Cookbook". Out of curiosity (and since they didn't mind if I wrote honestly), I complied. It ended up being a good decision, since this was a pretty easy read, and there are things in base R graphics that I've missed since venturing into ggplot2-land.

The book is most appropriate for beginner (and early-intermediate) users of R and R graphics. It is very well structured: the code for generating each graph always precedes the graph, and an explanation of how everything works follows it. The examples are relevant and quite easy to follow. While the author does not explicitly include an introduction to R, he explains the functions used in each code segment.

Earlier chapters go through the commonly-used charts, how to make them and how to tweak certain parameters. The author goes very slowly over the different plots and gives many examples over different chapters. Later chapters cover recipes for more exotic plots (like 3d plots, maps, sparklines, and calendar maps), showing the true power of R graphics, as well as the author's creativity. I don't think the author intended it to be read from beginning to end: the sections are very well structured, so it's pretty easy to flip to a random plot and look at how it's made.

One thing that I didn't like (which some people might like) is that most of the earlier chapters seemed like a puffed-up version of the R documentation. While the advantage of having a book is that you get a more approachable and thorough overview of a subject, you can find information much more quickly by using a combination of Google, online R graphics tutorials, and the R documentation. For example, if you know you want a bar plot, looking up the documentation for barplot() to see all tweak-able parameters is easier than waiting until chapter 5 to learn a few more of them.

The later chapters are more interesting: they really show that R can do a lot, and they did give me ideas about how to visualize the data I am currently working with. Of course, the advanced material is far from complete -- and it can't possibly be complete given the open nature of R: there are many, many packages in R, and the best thing to do if you're looking for something specific is just to use a search engine and read the documentation.

So yeah, it's a really well-written book by someone who is definitely qualified to write it (Mittal started It's a book that does what it intends to do: good for beginners to flip through to get ideas, and well-suited for someone who finds the documentation too intimidating. If you are at the stage where you're comfortable reading the documentation, though, then maybe about 30%-40% of the book could be helpful: you may still be inspired by some examples, though in that case I'd wonder if the book has enough content to be worthwhile.

So that's it. "R Graph Cookbook" by Hrishi V. Mittal. It was strange to find a bio of my friend Paul Butler on the list of reviewers, though that only affected the degree of my amusement and not how I saw the book.

End of Entry

Sunday, February 13, 2011

Another Pascal's Wager

What is your goal in life? It's a difficult question to answer, and if you haven't found an answer yet, let me propose a temporary one to you:
Live to grow. Live to be more mature, to be able to learn more quickly, and to be able to more quickly adapt to new situations. Treat life as training.
The reason is twofold: for one, if you eventually decide to do something else with your life, then regardless of what it is, the growth you've gained will help you. The strength and maturity you gain would make you more capable of achieving other things you want. You would have more and better tools to draw from.

The second reason is that it is fulfilling. Most people would say that they want to live to be happy, but there are different forms of happiness: there's happiness arising from everyday situations, and there's the deep sense of happiness you get from reflecting on who you are and what you do. Growing and becoming better includes being able to balance between the two types of happiness, and can be immensely fulfilling.

A third reason would apply if you are agnostic about whether or not we would retain our consciousness after death. In the event that we do retain our consciousness, then who we are, the maturity and discipline of our thoughts, are the only things we might be able to take away (certainly we can't take away much else, including anything physical). Our maturity and cognitive discipline are the only things that might help us in tackling the new challenges that might lie ahead.

End of Entry

Thursday, February 10, 2011

Statistics, data mining, machine learning, and culture

"You'll realize that you need a dictionary to go between measure theory and probability theory. The underlying concept is the same, but the terminologies are different." - Kathryn Hare, PM354 Measure Theory

When I started learning about statistical techniques, I knew the textbook definitions of statistics and data mining. The more I worked in these areas, the less clear the distinctions became. Then I met people who loved machine learning but didn't know statistics, and statisticians who hadn't heard of machine learning.

After asking people and reading around, I got a partial answer as to how these fields differ. In short, doing statistics is like asking a multiple choice question, mostly with two choices (i.e. is this true, or not?), with more emphasis on using data points "efficiently", since getting data from experiments is expensive. Data mining is more like doing exploratory analysis on a big data set, usually collected for other purposes, without guarantee of any results. Machine learning deals with automating decisions to optimize something in real time, so there is a focus on iterative methods and on-line algorithms that can generate better predictions over time.

Overall, though, the techniques used in each of these areas are pretty much the same. However, there is a pretty important difference between them, and that difference lies in the culture of the people using the techniques.

For example, statisticians are a very different breed of people compared to people in data mining and machine learning. When I think of the word "statistician", I still somehow think of an old, bald man in his PhD suit reviewing papers and writing reports for his consulting work. "Machine learning", on the other hand, has quite a different feel to it. I think of hackers, people who just want to get something cool working -- a book recommendation, a way to predict which ads you click on -- and whose method of "reporting" primarily consists of shouting across the room. "Data mining" seems to fall somewhere in between, but I haven't met enough data miners to be sure.

No, I don't think all statisticians are old bald men (in fact my mom is a statistician, and she's neither old, bald, nor a man). I think there are very cool statisticians out there who do really interesting and useful research (e.g. mom). I do think that each field tends to have its own distinct culture, just as each company, school, or any non-random congregation of people would.

The culture of people in different fields is something pretty important to think about when we decide what to do with our lives. There are many, many interesting fields out there, and choosing one that is an epsilon "more interesting" than the others is not as fruitful as understanding the culture of the people in these fields: how they do work, how they collaborate, and what they are generally like.

End of Entry

Tuesday, January 25, 2011

Choosing majors, how I got it wrong

When I went into university, I knew exactly what I wanted to do: double major in Pure Math and Actuarial Science, finish as many actuarial exams as I could (got one done in high school), and have good background theoretical knowledge, decent practical knowledge, and some experience -- all before I graduated. Perfect plan. Alas, I hated my position as an Actuarial Intern at Towers Watson (called Towers Perrin at the time), so it was back to square one: I had no idea what my major should be.

My reason for majoring in Pure Math was that out of everything that was interesting to me, it was the most difficult thing for me to learn. I ruled out CS as a potential major, mostly because practical programming knowledge was something I'd seen people pick up without formal training. Also, the first few CS courses were terrible: extremely slow lectures, slides, profs going on and on about "The Design Recipe" to make sure you format your comments a certain way... Granted, these were junior-level CS courses, but I didn't have the patience to see what upper-year CS courses were like. (Actually, I tried to sit in on a second-year CS course. I walked in as the prof was writing down all the powers of 2 on the board to explain the difference between a 'byte', 'kilobyte' and 'megabyte'. The students dutifully copied it down in their notes. I left.)

I decided not to major in Statistics for the same reason. It would be more fun to learn it on my own. The first two courses in statistics were not impressive, either, and I didn't want the structure of courses to kill my interest.

Combinatorics and Optimization would have been an interesting choice. I think I decided against it because the name was too long.*

So I chose Pure Math and Applied Math. Again, Pure Math was an obvious choice, just because it's so fun. Why I chose Applied Math is still a mystery to me. It was probably the only thing left on my list. (Maybe also something to do with physics being important and understanding dynamic systems being potentially beneficial?)

Regardless, I thought this was fine. I'd be learning what I want the way I want.

Unfortunately, I missed one important thing.

Malcolm Gladwell talks about what distinguishes experts from non-experts. He found that what sets experts apart is the enormous amount of time they have spent learning and practicing. He claims that the magic number of hours of practice that would make you an expert is 10,000. That's a lot of hours.

Yet if you choose the right field, the hours you spend listening to lectures, doing assignments, and solving exam problems count as hours learning/practicing; they count toward the 10,000 hours. This means that if you choose a suitable major that is well aligned with your goals, then it would help you get closer to becoming an expert: regardless of how bad the prof is or how terrible the lectures are, you are going to spend more time on the subjects covered in the course. In the grand scheme of things, the bad profs and bad structure are not as important, just because having those hours under your belt is what really counts.
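
To put rough numbers on this, here is a small back-of-the-envelope sketch in Python. Every figure in it (number of courses, weeks per course, hours per week) is a made-up placeholder rather than anything measured, and the function names are hypothetical; the only number taken from above is the 10,000-hour target.

```python
# Back-of-the-envelope estimate of how much of the 10,000 hours a major covers.
# All course counts and hour figures are hypothetical placeholders.

TARGET_HOURS = 10_000  # Gladwell's "magic number" mentioned above


def hours_from_courses(num_courses, weeks_per_course, hours_per_week):
    """Total hours spent on lectures, assignments and exams across all courses."""
    return num_courses * weeks_per_course * hours_per_week


def fraction_of_target(hours, target=TARGET_HOURS):
    """Share of the target number of hours that these hours account for."""
    return hours / target


# Assumed: 20 courses in the major, 12 weeks each, 10 hours a week per course.
major_hours = hours_from_courses(num_courses=20, weeks_per_course=12, hours_per_week=10)
share = fraction_of_target(major_hours)

print(major_hours)  # 2400
print(share)        # 0.24
```

Under these assumed numbers, a well-chosen major would account for roughly a quarter of the hours, which is a big head start even if the lectures themselves are not great. Plug in your own numbers to see how it changes.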

This is probably something very obvious for most people -- I hope it is, and that you made better choices than me. I suppose I had way more skepticism for post-secondary education than was appropriate.

*Long story. I was trying to fit a minor in somehow, but heard some rumour about how only a certain number of characters would fit on your degree, and decided not to risk it. Makes sense, doesn't it?

End of Entry

Monday, January 24, 2011

criticisms, revisited

In philosophy class, we discussed the objections and replies to the Meditations by Descartes. One set of objections was written by Hobbes, and another by Arnauld. Hobbes listed many objections, as if he was scrutinizing the text, looking for any little error he could possibly find. Arnauld had three main objections, one of which he wrote about at great length.

The professor pointed out (jokingly) how Hobbes was almost like an inexperienced grader, who went through a paper highlighting all the tiny mistakes without taking a step back to look at the big picture. Arnauld, on the other hand, must have given Descartes the benefit of the doubt whenever he could, and included only the most important objections.

It really does take experience and good will to give people the benefit of the doubt, to notice that while the t's aren't crossed or the i's aren't dotted, there is an interesting idea somewhere worth taking seriously.

This is an interesting distinction between a beginner and an expert. The professor pointed out how most people begin studying philosophy with a critical eye; they try to break apart any argument they see. While this isn't necessarily a bad thing (and can be fun), more experienced people focus more on trying to understand the big idea being presented by various philosophers, which I imagine would be more fulfilling.

I've talked about the same concept in relation to music. I wonder if this is true in areas outside philosophy/reading papers, and if this is some characteristic behavior of a beginner.

End of Entry


I had written this up after talking to Lin, who asked me about my resolutions, but forgot to publish it. Since it's kind of a tradition for me to list my resolutions on this blog, here they are.

Last year (2010) was an interesting year. I can't really say if it had been good or not, but it was interesting. I didn't consciously keep last year's resolution of "focus on what's important" as close to heart as I should have. It is a very open-ended resolution, so it's pretty difficult to judge how well things went. I also had a tremendous amount of luck: a lot of micro-decisions turned out well, I found the type of work that drives me, I learned a lot, and I made friends in the process.

I had another resolution last year, which was to declare my major. I did. My majors are Pure Math and Applied Math.

For 2011 I have two resolutions lined up. The two are related, and they are even more basic and more open-ended than the ones I had before:

(1) Don't do what I know is wrong.
(2) Do what I think is right.

These two things sound simple, yet those who can abide by them truly inspire me. Actually, of all the people I know, I'm really only certain of one person who does these. I really wish to be able to live up to this person sooner rather than later. (You know who you are.)

The two resolutions also sound the same, but they're not. (1) is a subset of (2), and is a much more realistic and less scary target. On the other hand, (2) completes (1); it's what I really should be doing. Hopefully I'll remember both of these for the rest of the year and truly live by them. (So far I'm doing okay. Not perfect, but okay.)

Memorable moments of 2010? Landlord. Buddhist temple. Blogging on Blackberry at Exploratorium on Valentine's day. Conspiracy theories. Grocery trips. Poking ringworms. Being mad at a friend. Forgetting a fellow intern's name in front of >200 people. Failing complex midterm. Logistic Regression. Hiking. TMI. Hiking at TMI. Putnam at Stanford. More hiking. Secret project. Causes. Hotels. Many Hackathons...

End of Entry

Monday, January 10, 2011

Our launch...

I suppose I had a pretty naive view of what a small project launch would feel like.

End of Entry