Friday, October 26, 2018

My favourite public speaker

I tend to agree with Paul Graham's blog post [1] about how easy it is for entertaining talks to have little substance at all. For example, I showed this TED talk [2] in a lecture to illustrate the power of posture, voice and body language in crafting a talk devoid of real content. The talk manages to be extremely entertaining. However, I'm secretly afraid that talks like [2] are entertaining not despite a lack of content, but because of it.

I think that really good public speakers manage to be easy to listen to despite delivering real, hard content. Digesting facts and understanding arguments is hard work, and a good public speaker should make it easy for the audience to really learn something new.

That's why my favourite speaker during undergrad was the late Hans Rosling, who had a knack for telling the stories behind the data in an entertaining way. Rosling's talks were often about the public's misconceptions about the world, about how our view of the world was outdated. He began his talks with humorous anecdotes, then delivered his message by taking his audience close to the data with a contagious enthusiasm. The choice of data visualization was most important, and could be anything from an animation showing changing income distributions [3] to stacked toilet paper rolls showing population growth [4].

A great computer science speaker is Gary Bernhardt from Destroy All Software [5]. If you are old enough, you will probably remember his "Wat" talk [6], and his less hilarious but more insightful talk on software Boundaries [7]. In his talks, Bernhardt introduces concepts in a meaningful order, and explains even complex concepts in a clear and concise manner.

I can't finish this blog post without mentioning Bret Victor [8] and Andrew Ng [9], both experts in their chosen fields. Their fields and their styles could not be any more different. But both of them tend to choose other mediums to communicate their ideas, and their talks tend to be less entertaining than they are mind-blowing. Both find ways to distill their ideas down to their essence, then explain them in simple words. Even though they may not make as many jokes as Bernhardt, Rosling, or the speakers Paul Graham refers to in his blog post, Victor's and Ng's talks are worth watching.

This blog post was written so that I could follow my own prompt at



Thursday, September 20, 2018

Candy Japan's A/B Testing

The idea behind A/B testing is very simple: randomly assign users to groups A or B, change one thing between the groups, then see whether the change affects some metric of interest.

Candy Japan recently wrote an article on "A/B testing how to ask YouTubers for product reviews". I really like articles like these, where an author breaks down a real experiment and discusses what they learned. In this case, the experiment was emailing YouTubers to see whether they would be interested in making an "unboxing" video to review Candy Japan's product. The author varied the words used in the email and analyzed how the variations affected the rate of positive responses. In total, 180 YouTubers were emailed.

Considering the amount of effort required to message each YouTuber, the author took the opportunity to A/B test several different "splits" simultaneously. The author, unfortunately, does not discuss his methodology regarding the splits, so I'll assume that each split was even and independent of the others.
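To make that assumption concrete, here is a minimal sketch of what "even and independent" splits could look like. This is only my illustration, not the author's actual procedure: the channel names, the split names and the `assign_splits` helper are all hypothetical.

```python
import random

def assign_splits(channels, split_names, seed=0):
    """Give every channel a variant ("A" or "B") for each split.

    Each split is shuffled separately, so every split is even (half A,
    half B) and the splits are independent of one another.
    """
    rng = random.Random(seed)
    assignment = {channel: {} for channel in channels}
    for split in split_names:
        shuffled = list(channels)
        rng.shuffle(shuffled)  # a fresh shuffle per split keeps splits independent
        half = len(shuffled) // 2
        for i, channel in enumerate(shuffled):
            assignment[channel][split] = "A" if i < half else "B"
    return assignment

# Hypothetical placeholders: 180 channels and the three variations discussed in this post.
channels = [f"channel_{i}" for i in range(180)]
splits = ["call_to_action", "elevator_pitch", "discount"]
assignment = assign_splits(channels, splits)
```

Under this scheme each YouTuber receives one email that combines a randomly chosen variant of every split, which is what would let several variations be tested with the same 180 emails.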

Some of the results presented were great. For example, including a call to action:
If you would like to receive a review box, please let me know your mailing address.
had the most impact, with the message reducing positive responses by more than 10%. It is a well-known fact in advertising technology that including the price of an item decreases the click-through rate. Here, asking for something as sensitive as a person's mailing address in the first email can come off as creepy.

I did find some of the other results less than convincing. There was a difference of less than 2% in positive response rates between including the following "elevator pitch" vs not:
I run a site called Candy Japan, which sends surprise boxes of Japanese sweets to members twice a month.
Since the 180 YouTubers would need to be split into two groups of around 90 each, a difference of less than 2% means the two groups differed by no more than 2 positive responses (2% of 90 is 1.8). This is a very small difference, too small to justify even the author's toned-down conclusion that "including an elevator pitch of your service may help."
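One way to check how small this difference is would be a two-proportion z-test. The sketch below is a rough illustration under my own assumptions: the article does not report per-group counts, so the counts of 26 and 24 positive responses out of 90 are hypothetical, chosen only to match a gap of 2 responses and an overall response rate of just over a quarter.

```python
import math

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return the z statistic and two-sided p-value for a difference in proportions."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that both groups share one response rate.
    pooled = (success_a + success_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 26 vs 24 positive responses, 90 emails per group.
z, p_value = two_proportion_z_test(26, 90, 24, 90)
print(f"z = {z:.2f}, p-value = {p_value:.2f}")
```

With counts like these the p-value comes out far above the conventional 0.05 threshold, so a gap of two responses is entirely consistent with random noise.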

Moreover, the difference between offering viewers a discount vs not:
I can also give your viewers a discount coupon.
was also around 1-2%. It is interesting (but understandable) that the author's takeaway there was that discounts don't really matter -- at least for getting YouTubers interested in making a video. Of course, including a discount might encourage video viewers to become new Candy Japan customers, which is the real goal.

I'm as surprised as the author that the positive response rate was over a quarter. The success is a testament to the author's effort in targeting the right channels. I wonder how the results would change if he targeted a wider group of YouTubers, without the initial selection. I also wonder how he assigned channels to groups, and whether there were any correlations between the groups. Experiments like this are so difficult because there are so many features to test, and getting a large sample size is a lot of work.

The author promised a part 2, to test whether this method of advertising yields better results than buying YouTube ads. I'm very curious to see the results.

This blog post was written so that I could follow my own prompt at 

Tuesday, September 18, 2018

A (re)introduction

It is surreal that this blog has been around for almost a decade. I'm almost afraid to write here, lest someone dig up something unexpected from my undergraduate days. (Please don't. I don't know how I got away with such terrible writing.)

I'm back because I am teaching a communication skills course to undergraduate computer science students. I am asking the students to write weekly blog posts, and figured I should follow my own prompts once in a while. I want to feel the same blank screen and blinking cursor as my students.

So here I am, following this week's prompt to introduce myself to everyone in the class.

I am Lisa, currently an "Assistant Professor, Teaching Stream (CLTA)" at the University of Toronto Mississauga. The term "Teaching Stream" means that I am what is traditionally called a "Lecturer". The acronym "CLTA" means that I'll be at UTM for the next 2 years.

My path to teaching was long, winding, and full of surprises. I was once on a roller coaster called a startup. I spent several years as a data scientist, building models to make people click on ads. I published a few papers during my master's, and learned to write (more) properly. I hope that traversing the winding path of life made me a better teacher, so that I can bring together startup, industry, and academic experience in the courses I teach.

I told my students to email me if any part of my background interests them. A few students took me up on the offer. They asked, for example, about how to get an internship at Facebook. Maybe I'll share some advice here too, someday. Until then, email me with your questions about getting into grad school, getting started with machine learning, and applying for internships! It is much easier for me to share resources via email than in person.

The truth is, I never expected teaching to be a possible career option. Pursuing this career is riskier than one might realize. To add to the fun, I don't have a PhD and don't (yet) intend to get one. But what would life be if we didn't take chances to do what we find meaningful?

So, if you know of universities that are hiring full-time teaching staff in Computer Science, especially Machine Learning, please let me know.

Tuesday, November 8, 2016

The Fundamentals

What changed me from an amateur into a professional was getting a really firm grip on the fundamentals -- Toshiro Kageyama (7d Professional Go Player)

It was pretty much the only advice I remember my dad repeatedly giving me, both before I started undergrad and before I started my master's just a few months ago: focus on the fundamentals. He may have said other things, but "focus on the fundamentals" is the only phrase that really stuck.

You might have heard of the story of Da Vinci and the egg that illustrates this concept:
It has been said that when Da Vinci first came to Verrocchio’s workshop, he was told to draw eggs. Day after day, Da Vinci was told to draw eggs all the time. One day, Da Vinci finally got tired of drawing eggs so that he came to his master Verrocchio for complaint. However, his master explained to him of profound significance “Drawing eggs is not a simple thing to do, even for the same egg, if you change the observation angle, the light will change as well, and you will find the different shape of it.” Da Vinci suddenly understood the purpose of his master. After then, Da Vinci accepted drawing eggs with an open mind which actually helped him built the foundation of further achievement.
The fundamentals can often be less interesting than whatever is new and shiny. Mastering them requires patience and honest self-assessment about how much one actually understands. It's always so tempting to "move fast and break things". But a firm grasp on the fundamentals is necessary to be able to intuit deep connections and do meaningful work.

There's a second part to the argument for focusing on the fundamentals. Popular research will move on, and whatever is new and shiny now will cease to be important in a few years. The skills that will stay relevant for a long time will end up being the fundamentals, the things that won't change or go out of favour. This is even more crucial in a field like machine learning that moves lightning fast.

But what exactly are the fundamentals of machine learning? There are the obvious tools like linear algebra and multivariate calculus. There's regression and its generalizations, gradient descent and its second-order extensions, backpropagation and the like. What about all the types of neural networks with the many acronyms like CNN, RNN, LSTM, ...? At what point do we break away from the fundamentals and find ourselves in the arena of popular research, the kind of things that will cease to be important in a few years?

What's been smelling most like fundamentals in the last few months has been variational inference and variational autoencoders (VAEs). They come up everywhere in recent research, but VAEs were only introduced in 2014!

It's possible that something else will take their place in another few years. If so, are they still fundamentals? Perhaps machine learning is just such a young field that the fundamentals are still being built?

Thursday, October 20, 2016

Less boring

You've become boring. -- Jee (~mid 2015)

The truth is, I had become boring. I had become boring because happiness is boring and comfort is boring. The formula is really quite simple: a loving relationship, engaging and well-paid work, a nice place to live, good health and lack of serious responsibility. It's all one can ask for.

So, instead, here I am, back in school to study Machine Learning.

...and then when you graduate, you can get a job as a Senior Data Scientist! Oh wait... -- Matt (May 2016)

It might not make sense financially. It might not make sense as a career move. It might not even make me a "better" person.

All I can say is, it's making me less boring.

I might even have more to say now.

I might even say it here.