Thursday, July 29, 2010

Mining Jobmine: Part 3. From the Employer's Perspective

Recently, Paul asked the question of what would make his resume more effective. I now ask a very similar question from the employer's point of view: what can employers do to make job postings more effective? While A/B testing job postings is not an option for me, it is possible to look at Jobmine data to find attributes of job postings that correlate with the number of applications.

Keep job postings short

There is a negative correlation between the number of words in a job posting and its application rate. This correlation is very small, but still statistically significant. Below is a smoothed scatter plot of words per job posting vs. applications (darker colours mean a denser packing of points), with a curve of best fit [1].

The curve implies a loss of about one application for every 50-60 words added. Again, the decrease is slight, and the length of a job description explains very little of the variation amongst application rates. This is not surprising: many factors affect a job's application rate, most obviously the job itself, and we expect the effect of the length of a job description to be minor compared to more important factors.
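The fit described in footnote [1] (regress the square root of applications on word count, then square the prediction) can be sketched roughly as follows. This is a minimal illustration, not the code used for the analysis: the function names and the five data points are made up, and it uses plain least squares with no significance testing.

```python
import math

def fit_sqrt_regression(word_counts, applications):
    """Least-squares fit of sqrt(applications) = a + b * word_count."""
    n = len(word_counts)
    ys = [math.sqrt(y) for y in applications]
    mean_x = sum(word_counts) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in word_counts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(word_counts, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def predict_applications(a, b, word_count):
    # Squaring undoes the square root, so the curve is quadratic in word_count.
    return (a + b * word_count) ** 2

# Made-up data for illustration only (not Jobmine data).
word_counts = [100, 200, 300, 400, 500]
applications = [64, 49, 36, 25, 16]

a, b = fit_sqrt_regression(word_counts, applications)
print(a, b, predict_applications(a, b, 300))
```

Because the prediction is squared, the number of applications lost per extra word depends on where on the curve you are, which is why a single "words per application" figure is only approximate.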

To uncover other subtle factors affecting application rate, I tried a technique I learned at Facebook: for each job posting, I calculated the percentage of words used in each of the approximately 100 word categories in Harvard’s General Inquirer dictionary (e.g. percentage of positive words, food-related words, law-related words, etc). While this method did not yield as much insight as I had hoped [2], there was one interesting observation...
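The per-category percentages can be sketched as below. This is only an illustration under stated assumptions: the `CATEGORIES` dictionary is a hypothetical, tiny stand-in built from the example words mentioned in this post, not the actual General Inquirer word lists, and the tokenizer is deliberately crude.

```python
import re

# Hypothetical stand-ins for three word categories; the real
# General Inquirer dictionary has far more words per category.
CATEGORIES = {
    "you": {"you", "your"},
    "our": {"we", "our"},
    "ought": {"must", "should"},
}

def category_percentages(text, categories=CATEGORIES):
    """Percentage of the words in `text` that fall in each category."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {name: 0.0 for name in categories}
    return {
        name: 100.0 * sum(w in vocab for w in words) / len(words)
        for name, vocab in categories.items()
    }

# Made-up posting text for illustration.
posting = "You must have Excel skills. We value our interns."
percentages = category_percentages(posting)
print(percentages)
```

Each posting then becomes a row of percentages, one column per category, which can be correlated with the number of applications.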

Talk about the company, not the candidate

There is a negative correlation between “you” pronouns (“you”, “your”, etc) and application rate, and a weaker positive correlation between “our” pronouns (“we”, “our”, etc) and application rate. This makes some sense: perhaps students enjoy reading about what a potential employer is like, rather than about what they must do or be. Perhaps seeing someone say that "you should have a solid knowledge of spreadsheet applications" is taken to be a bit aggressive. Incidentally, there is also a negative correlation between “ought” words (“must”, “should”, etc) and application rate.

The word “you” came back again when I analyzed the correlations between application rates and the appearance [3] or increased use [4] of individual words (as opposed to word groups). Indeed, there is a negative correlation between repeated use of the word “you” and application rate.
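The two word-level tests from footnotes [3] and [4] can be sketched like this. The data is made up, and the sketch makes two assumptions the post does not confirm: it uses Welch's unequal-variance form of the unpaired t statistic, and Pearson's coefficient for the count correlation. It computes only the test statistics; p-values would need a t distribution, which is omitted here.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent (unpaired) samples."""
    va, vb = variance(sample_a), variance(sample_b)
    se = math.sqrt(va / len(sample_a) + vb / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up postings for illustration: (description, applications).
jobs = [
    ("you you you must know java", 10),
    ("we design energy projects", 40),
    ("you should know sql", 20),
    ("our engineers love analysis", 35),
    ("you and your team", 15),
    ("we build things in toronto", 45),
]

word = "you"
counts = [text.split().count(word) for text, _ in jobs]
apps = [n for _, n in jobs]

# Footnote [3]: split jobs by whether the word appears at all.
with_word = [n for c, n in zip(counts, apps) if c > 0]
without_word = [n for c, n in zip(counts, apps) if c == 0]
t = welch_t(with_word, without_word)

# Footnote [4]: correlate the number of appearances with applications.
r = pearson_r(counts, apps)
print(t, r)
```

A negative `t` means postings containing the word drew fewer applications on average, and a negative `r` means more repetitions went with fewer applications.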

Good words, bad words

Several other words are correlated with application rates. Here are some words whose appearance or increased use is positively correlated with application rates:
Analysis, Capital, Construction, Design, Electrical, Energy, Engineers, Engineering, Excel, Mechanical, Projects, Toronto
Many of these words relate to the previous parts of “Mining Jobmine”, as they identify fields in low supply or high demand (which are apparently finance and engineering, especially mechanical engineering), and places that Waterloo students want to be (well, Toronto...). I’m not sure how to interpret the word “projects”.

As for words whose appearances are negatively correlated with application rates [5], there are actually more of these than "positive" words. Below is a partial list consisting of the most statistically significant words.
Application, Community, Development, Framework, fulltime, hours, HTML, Java, need, .NET, open, planning, Server, SQL, title, Unix, users, Web, Windows, within, XML
Again, the programming words in this list suggest that programming jobs are in low demand or high supply. Other words are hard to interpret: should employers refrain from talking about their hours, their fulltime employees, or their users’ needs? Perhaps some of these correlations are spurious.

Junior, Intermediate, AND Senior

Each job posting on Jobmine has one or more “level” tags associated with it: Junior, Intermediate, and Senior. These tags describe the “level” of students that an employer seeks, and are used by students to search for jobs appropriate to their level. The plot below shows the mean application rate (with a 95% confidence interval) of jobs with each set of tags, with the red line showing the mean application rate over all jobs.

In most cases, adding an extra “level” tag increases the application rate by about 10 applications, presumably because more students are likely to see your job when they search by level. The exceptions are, of course, those 7 jobs that are tagged Junior and Senior...
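Per-tag-set means and confidence intervals of this kind can be computed along the following lines. This is a sketch under assumptions: the data is invented, and the interval uses a normal approximation (mean plus or minus 1.96 standard errors), which may differ from how the plotted intervals were produced and is unreliable for very small groups.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

def mean_ci_by_tags(jobs, z=1.96):
    """Return {tag_set: (mean, low, high)} with a normal-approximation 95% interval."""
    groups = defaultdict(list)
    for tags, applications in jobs:
        groups[frozenset(tags)].append(applications)
    result = {}
    for tags, values in groups.items():
        m = mean(values)
        if len(values) > 1:
            half = z * stdev(values) / math.sqrt(len(values))
        else:
            # A single job gives no estimate of spread.
            half = float("nan")
        result[tags] = (m, m - half, m + half)
    return result

# Made-up data for illustration: (level tags, applications).
jobs = [
    ({"Junior"}, 20), ({"Junior"}, 30), ({"Junior"}, 25),
    ({"Junior", "Intermediate"}, 35), ({"Junior", "Intermediate"}, 45),
]

summary = mean_ci_by_tags(jobs)
for tags, (m, low, high) in summary.items():
    print(sorted(tags), round(m, 1), round(low, 1), round(high, 1))
```

With only a handful of jobs in a group, as for the 7 jobs tagged Junior and Senior, such an interval will be wide and should be read with caution.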

Avoid special instructions

Special instructions are red-coloured messages that appear above a job description in Jobmine. Employers use them to announce information sessions, to remind students to apply through their website, or for other reasons. Around 40% of job postings on Jobmine have special instructions, and these postings receive 6 fewer applications on average than postings without special instructions. This is quite a large difference - and statistically significant, too. Perhaps the contents of special instructions turn applicants away? Perhaps people don’t like seeing big bright red messages when reading a job posting? Either way, including special instructions might have drawbacks that employers do not expect.


While most students spend hours perfecting their resumes, employers don’t always think as much about job descriptions. Yet these analyses show that a student’s decision to apply for a job can be influenced by factors other than the job itself. Some of these influences are marginal, while others are large. The analyses suggest that employers can increase the candidate pool by shortening job postings, rewording job descriptions, or by being cautious about using special instructions. Of course, an employer’s end goal is to find a suitable candidate, and so the quality of the candidate pool is more important than its size. Whether or not improving a job description is worth an employer’s time is another story -- especially since the effects of changing an individual job posting are uncertain.


[1] Application numbers are heavily skewed, so to satisfy the assumptions of the linear regression model we take the square root of application rate as our dependent variable. Number of words in a job posting is still our independent variable, and the curve we get is a quadratic.
[2] Several word categories showed statistically significant correlations with applications, but these correlations are hard to interpret because many word categories are filled with homonyms and questionable words. For example, the category “Land” contains words describing places occurring in nature, and is correlated with applications. However, the top words contained in this category are “field”, “range”, “bank” and “fall”. As another example, the words “time”, “service” and “fun” are considered “hostile” words in General Inquirer.
[3] To test the effect of the appearance of a word, I split up the jobs based on whether or not a particular word appeared in their job descriptions, and used an unpaired two-sample t-test. Very uncommon words and very common words were ignored.
[4] To test the effect of the number of appearances of a word, I correlated the number of times a word appears in a job description with the number of applications, and calculated the p-value. This analysis was done only on words that appear more than 10 times in at least one job posting.
[5] All of these words are significant when [3] is applied to them.