How to Apply to a Computational Biology PhD

Intro

The process of applying to computational biology PhD programs is surprisingly opaque. You can google “how to apply to PhD programs” and get plenty of hits, but much of the advice is either too abstract (“be true to yourself”) or not applicable to computational biology programs (“ask professors whether they have openings in their lab”). In this post I hope to collect the advice I’ve read or learned along the way in one place.

This advice is aimed at people in the US planning on applying to graduate schools in the states. Many of the research opportunities I discuss, such as SURFs, REUs, and postbacs, are funded by the government and have citizenship or permanent residency requirements. The PhD system is also different in other countries. If you want to apply to a PhD program in Canada, for example, you’re expected to either have a master’s degree or apply to a non-terminal master’s program that transitions into a PhD. I didn’t apply to many schools outside the states, so I’m unqualified to advise what proper preparation for doing so would look like.

As always, comments, questions, concerns, and theories about the end of Inception can be directed to my Twitter.

Intro
Table of Contents
Terms to Know
Should You Apply to a PhD?
Getting Research Experience
Deciding Where to Apply
- Reading Papers
Timeline
How to Apply
Other considerations
Conclusion

Terms to Know

PI - Short for Principal Investigator. Not all research is done in academic labs, so not all lead researchers are professors. The term PI is used to be more inclusive¹.
CV - Short for curriculum vitae. Academics use “CV” instead of “resume” to impress people with their knowledge of Latin². There are some minor style differences, though. For example, it is a faux pas to have a resume longer than one page, but that’s not the case with a CV. As a grad school applicant your CV may still be a one-pager, but that’s because much of the length of a CV comes from publications, grants, and awards.
Rotation - Many grad schools in computational biology are rotation-based. That is to say that instead of applying to a lab, you apply to the program as a whole. Once in the program, you’ll spend a few months in multiple labs to get an idea of what you like before joining your thesis lab.
Umbrella program - Umbrella programs combine multiple fields into one department to promote interdisciplinary work. Depending on what programs you select, you may end up applying to an umbrella program rather than one specific to computational biology. Yale’s BBS program is one example of how umbrella programs can work.
Other names for computational biology - If you are going to google keywords to see which programs are out there, you should also try searching for bioinformatics, biomedical informatics, genomics, and systems biology programs. There are differences between these fields, but if you don’t already know what those differences are you may find interesting research in those fields that you wouldn’t find by just searching computational biology.

Should You Apply to a PhD?

Before you apply, you should think about whether you really want to do a PhD. I’ve already written up some reasons why you shouldn’t do a PhD and why you should. The tl;dr is that you should do a PhD if you want the freedom to work on complex problems with uncertain solutions, and you are willing to accept the financial and emotional difficulties of doing so.

Getting Research Experience

If you’re going to apply to a PhD program, you should probably have some previous research experience. Frequently you can do projects during undergrad, but if you’re attending a smaller university or one that isn’t research-focused, it can be hard to find a computational biology research project that interests you. If you can’t find a project you like, or decide to do a PhD late in your undergrad career, you still have options.

Summer Internships

As an undergrad you can get a paid research internship working in a lab for a summer. In my (biased) opinion, the NIH summer internship program is excellent because of the variety of research that goes on at the NIH. There are also summer internship programs hosted by universities. PathwaysToScience’s List is a good place to start, though it isn’t comprehensive. Some search terms you’ll want to try are REU, SURF, and SURP (short for research experience for undergraduates, summer undergraduate research fellowship, and summer undergraduate research program, respectively). If you are trying to decide whether you enjoy computational biology research, you might also want to check out the GCB Summer CompBio Preview.

Postbacs

You are typically not eligible to do REUs/SURFs after you graduate. There is a similar type of program called a postbac (short for Postbaccalaureate Research Program), though. These are typically paid positions lasting one or two years designed to prepare you for grad school. The NIH has several of these positions, and some universities host them too.

Master’s Degrees

Lots of people do pure biology coursework in their undergrad and want to focus more on computational analysis in grad school (or vice versa). If that describes you, it may be helpful to do a master’s degree in the field you want to go into.

You shouldn’t think that a master’s is required, though. Computational biology programs frequently have people apply with entirely biological or computational backgrounds.

Working for a Lab

You could also look for jobs working in a lab you want to apply to. Working as a research analyst or lab technician might be helpful, as you’ll get an idea of exactly what the day-to-day research life looks like.

You can also volunteer to help with research in a lab. Getting a volunteer position in a particular lab is probably easier than getting a paid one, since a paid position requires you to apply at the same time that the lab happens to be hiring. That being said, I hesitate to recommend volunteering. You have skills that are worth money, and there is enough unpaid work going on in academia as it is.

Deciding Where to Apply

I didn’t have any idea where I wanted to do a PhD until pretty late in my undergrad career. I knew I wanted to apply machine learning to biology but didn’t know what schools actually did that. At first I tried looking at professors’ websites, but they’re pretty vague and almost uniformly out of date. Eventually I bit the bullet and read academic papers.

Reading Papers

Reading academic papers is a skill³ that you may not have yet. Worse, while there are guides for learning how to read papers, they don’t focus on how to read papers to find schools to apply to.

Your goal is to find which labs are doing research you’re interested in, not what the state of the art in a field is. As a result, you’ll focus more on abstracts and author affiliations than on results and conclusions. Here’s an example workflow for how to find labs through papers:

1. Decide on a research topic you’re interested in.
2. Search for a review article⁴ on the topic (Google Scholar or other academic search engines will work better than vanilla Google).
3. Read through the article and record the papers it cites that sound interesting to you.
4. Find the papers that you liked from step 3⁵.
5. Read the abstract of each paper to see if it still sounds interesting.
6. If so, take note of the authors, especially the first and last ones⁶.
7. Look into the institutions and labs of the authors whose research you liked.
8. Repeat from step 1 or step 2.

Working through these papers will take a while, so don’t get discouraged. At the end, you’ll have a list of research topics you find interesting and PIs you might want to work with. Aim to apply to universities with more than one faculty member on your list. You (and they) don’t always know whether they will have open spots when the time comes for you to pick a lab, so it’s important to avoid having a single point of failure.
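If you keep your notes from the workflow above in a structured form, a few lines of Python can tell you which universities have more than one PI on your list. This is only a minimal sketch: the record fields (`title`, `last_author`, `institution`), the helper names, and every entry in `papers` are made-up placeholders, and it assumes the last author is the PI, which is usually but not always true.

```python
from collections import defaultdict

# Hypothetical notes: one record per paper you liked.
# Every title, name, and institution here is a placeholder.
papers = [
    {"title": "Paper A", "last_author": "PI One", "institution": "University X"},
    {"title": "Paper B", "last_author": "PI Two", "institution": "University X"},
    {"title": "Paper C", "last_author": "PI One", "institution": "University X"},
    {"title": "Paper D", "last_author": "PI Three", "institution": "University Y"},
]


def pis_by_institution(records):
    """Group the distinct last authors (assumed to be PIs) by institution."""
    grouped = defaultdict(set)
    for record in records:
        grouped[record["institution"]].add(record["last_author"])
    return grouped


def shortlist(records, min_pis=2):
    """Return institutions with at least `min_pis` distinct PIs."""
    grouped = pis_by_institution(records)
    return {
        school: sorted(pis)
        for school, pis in sorted(grouped.items())
        if len(pis) >= min_pis
    }


for school, pis in shortlist(papers).items():
    print(f"{school}: {', '.join(pis)}")
```

With the placeholder data, only University X makes the shortlist, because University Y has a single PI and would be a single point of failure.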

Timeline

The PhD admissions cycle occurs yearly and has four main parts: applications, interviews, school acceptances, and student decisions. The timeline below walks through each one:

Apply: Deadlines for PhD program applications typically fall around December 1.

Interview: Interviews take place from late January to mid-March. Each school will only have one or two interview weekends, so if you get interviews at several schools you may have to drop some schools early.

School Acceptances: Schools will typically let you know whether you’ve been accepted into the program a few weeks after their interviews.

Student Decisions: After interview season you will have until April 15 to decide which school you want to attend.

How to Apply

Now that you know where you want to apply and when you need to apply, you can start working on actually applying. The main application materials that you’ll need to prepare for each school are (in order of importance) letters of recommendation, a personal statement, a CV, and (maybe) GRE scores.

The primary thing to keep in your mind as you’re putting together these materials is that admissions committees are trying to determine whether they expect you to do good research. As a result, each of your application items should be tailored to show that (1) you’re a good researcher and (2) the research that you want to do fits into their program.

Rec Letters

Letters of recommendation are arguably the most important part of a PhD application, though they are also the part that you have the least control over. Recommendations are heavily weighted because your admissions committee will be made up of professors trying to determine whether you will be a good researcher. A strong letter of recommendation is another professional researcher (whom members of the committee might even know) telling them that you are, in fact, good at doing research. Accordingly, your letters of recommendation should be from professors, ideally ones you have done research with.

You won’t be writing your letters of recommendation, so there isn’t much advice to give here. This guide on how to ask recommenders to write you a letter may come in handy, though.

Personal Statement

Computational biology programs draw in people from a diverse set of academic backgrounds. People with degrees in biology, computer science, math, and other domains all end up applying to the same programs. Your personal statement is your opportunity to tell the admissions committee why you, specifically, would be an asset to their program.

In terms of content, personal statements are really research statements. I think Matt Might explains this well:

‘Personal statement’ is a terrible name for this document, because it confuses applicants. Use this statement to answer the following question in essay form: ‘Why should we, the admissions committee, believe that you, the applicant, have the potential do research in field X?’ and ‘What kind of research could you see yourself doing and why?’

I think this post also does a good job of addressing what you should be thinking about when you write your statement:

Why do you want to complete further research in this field?

Why have you chosen to apply to this particular university?

What are your strengths?

What are your transferable skills?

How does this program align with your career goals?

The only thing I’d add is that personal statements should also address any questions you feel your application might raise. If you have a low GPA or something else you feel may cast you in a negative light, the personal statement is an opportunity to explain why those setbacks won’t get in the way of your graduate studies.

While the content of a personal statement varies from person to person (by definition), I can give some advice on the mechanics of writing one. First, write a strong template personal statement before anything else. Details will change between the individual programs, but your format doesn’t always need to. Once you’ve written a compelling account of your research experiences and motivation to study computational biology, you can change small details without having to rewrite the whole document ten times.

Second, you should tailor that template statement for each school you apply to. Every program will have a different focus. If your personal statement talks about how much you enjoy testing therapies on mouse models, a program that focuses on analyzing medical records will be unimpressed. A statement about how you loved creating a program to discover the way populations of mice responded differently to treatment and how you think similar methods could be applied in a hospital setting would go over better.

The focus stated on the program’s webpage may be vague or unclear. If that’s the case, you can aim towards the research focus of the labs you’re interested in joining.

Ultimately, your personal statement is the thing you have the most control over in your application. Be sure to spend the time it takes to write a strong statement.

CV/Resume

Your CV format will mostly look the same across schools you’re applying to. You’ll want to include your name and contact information at the top, along with sections about your education, research experience, publications, and awards (in roughly that order). Some guides will tell you to include a mission statement/personal statement/profile at the top of your CV. I’m skeptical that doing so is ever useful, but it is particularly redundant in grad school applications since you’ll be writing a long-form personal statement anyway. For examples of what a CV should look like, look for the CVs posted on the websites of the faculty members you found in the deciding where to apply section.

You can tailor your CV to each school for maximal effect by modifying your descriptions of your research experience. For example, part of my research experience section when I applied to computational biology programs looked like this:

Summer Fellow - NIH
May 2016 - Aug 2016

Analyzed and improved genetic comparison software

Worked with terabytes of data on a 200+ node compute cluster

Used Bash, Python, and R to develop a quality control pipeline for metagenomic samples

If I had been applying to a program more focused on, e.g., population genetics, I would have focused less on the computation and more on the genetics aspect:

Summer Fellow - NIH
May 2016 - Aug 2016

Analyzed linkage disequilibrium to develop a quality control pipeline for metagenomic samples

Measured similarity of LD blocks across related and unrelated individuals

Worked with data across several 1000 Genomes Project populations

The two versions of the subsection reflect different aspects of the project I worked on. They’re both equally true, but they say different things about what I’m interested in and capable of doing in my research. By making similar small modifications to your CV, you can ensure that you’re communicating how cool your research was in ways the admissions committee will appreciate.

The GRE

There’s not much to say here. The GRE is something that some programs care about a little bit, so studying to get a good score is probably worth doing. However, many schools don’t require the GRE anymore, so check with the programs you’re applying to before you shell out the money to take it.

Other considerations

Rejection

You will get rejected (a lot). Rejection is an unfortunate fact of life, and it hurts more when you’ve put in as much time and effort as grad school applications demand.

Few schools publish their PhD admissions statistics, and when they do the results don’t tend to be at per-program resolution. However, based on the numbers I’ve found, acceptance rates tend to run around 10-20 percent.

Schools’ low acceptance rates have two main implications. First, when (not if) you get rejected, you should understand that it’s due to bad luck, not a failure on your part. Second, you should apply to several schools.

How Many Schools to Apply To

Based on the low acceptance rates mentioned above, applying to 5-10 schools has an expected value of one acceptance⁷.
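The arithmetic behind that claim can be sketched in a few lines of Python. The sketch assumes every school accepts you independently with the same probability, which is a simplification (see the footnote above), and the function names are my own. It also computes the chance of getting at least one acceptance, which is arguably the number you care about more.

```python
def expected_acceptances(n_schools, p_accept):
    """Expected number of acceptances, assuming the same rate at every school."""
    return n_schools * p_accept


def prob_at_least_one(n_schools, p_accept):
    """Chance of one or more acceptances, assuming independent decisions."""
    return 1 - (1 - p_accept) ** n_schools


# The (schools, acceptance rate) pairs are illustrative, not real data.
for n, p in [(5, 0.20), (10, 0.10)]:
    expected = expected_acceptances(n, p)
    at_least_one = prob_at_least_one(n, p)
    print(f"{n} schools at {p:.0%}: expected {expected:.1f}, P(>=1) = {at_least_one:.2f}")
```

Under these assumptions both scenarios have an expected value of one acceptance, yet both leave roughly a one-in-three chance of ending the season with no offers at all.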

The optimal number of schools to apply to depends on a number of factors. First, if you have a niche topic you want to research, there may not be many schools with faculty doing research in that area. I wouldn’t recommend applying to programs that aren’t doing research you’re interested in. Six years is a long time to do something you don’t love, especially if you’re not getting paid much.

Second, you only have so much time to write your application materials. Hopefully you won’t have to change much between schools, but refitting your personal statement and refilling each program’s artisanal transcript form eats up a lot of time. Budgeting your time and focusing on schools you want to attend will yield better results than blanketing all the PhD programs you can find with applications.

Finally, you have to factor in your risk tolerance and your best alternative to a PhD program. If you end application season with zero acceptances, you’re going to have to wait a year to reapply. If that is unacceptable to you for one reason or another, then you’ll want to spend more time finding programs you’re interested in and applying to them.

In terms of concrete numbers, I’d say a reasonable number of schools to apply to is between 5 and 12 (inclusive). If you’re on the lower end you run the risk of not getting into any programs even if you’re a well-qualified applicant, and you run into diminishing returns on the high end.

My application process looked something like this: I was interested in the intersection of deep learning and biology, and I couldn’t find many faculty members in my paper search who were doing deep learning at the time. I ultimately found five schools with multiple faculty members I wanted to work with. I then spent my fall putting together applications for those schools, got an interview at one program, and ended up with an offer that I accepted.

I knew going in that my application number was on the low end. I love writing code, so I was perfectly willing to spend a few years as a software engineer until I decided to apply again. If I had been set on being a professor instead, there would be no alternative to grad school. In that case, I would have written more applications to ensure that I could get in somewhere.

Finances

Grad school applications are expensive, especially when you’re applying for several programs. If the fees are enough of a burden that you’re not applying to schools that you would like to, then you should apply for fee waivers. Every school I’ve seen has a fee waiver system, and some will even automatically waive your fee if you’re a member of pathway programs. There are also waivers for GRE costs (though they only give you a 50% off coupon).

Most PhD programs in comp bio will pay you money (but not much) to do research. However, some programs may offer unfunded PhD positions. I believe this is less common in computational biology than in fields with less funding, but it’s something to be aware of. I’m not very familiar with the cost-benefit analysis of doing an unfunded PhD except that I don’t generally endorse doing unpaid labor for years. For a more informed/well-reasoned take, see this article by a person who actually did an unfunded PhD.

Miscellaneous

Lots of articles online will encourage you to email professors who you want to work with during the application process. Don’t do this (or at least don’t expect it to factor into your admission chances). In any school with a rotation system individual professors won’t have a say on your admission. You will join a program, not their individual lab.
You should have someone proofread your application materials, especially your personal statement. Ideally that person would be a professor you work with or a friend in the same field, but there are other options. Many universities have some version of a Career and Professional Development department with people who look over application materials as part of their job. At my undergrad the English department hosted a writing center where you could book an appointment to have a grad student proofread applications. Resources are out there, so be sure to look for them.
There is a website named after a place to consume coffee that discusses grad school admissions. For your mental health, I recommend never going there.

Conclusion

Applying for PhDs is hard. Hopefully this post can save you time and alleviate some of the stress of application season.

Remember that there’s a light at the end of the tunnel. Interview season is as fun as application season is stressful⁸. Focus on the process, not the outcome, and everything will be fine one way or another.

Acknowledgements

A huge thank you to the NIH OITE staff (especially Drs. Milgram and Sokolove), who were great teachers of the grad school application process.

Footnotes

¹ Technically PI also has a financial dimension in that it denotes the person who is receiving the grant funds for a project.
² For the same reason we say “et al.” instead of “and others”, “corrigendum” instead of “correction”, and “academia” instead of “those trees where Plato taught back in the day”.
³ I didn’t believe reading papers was a skill when I was an undergrad. My opinion was “Reading is a skill, but I already know how to read”. As a result I was frustrated with myself for being bad at something I’m very good at (reading) rather than something I was entirely inexperienced at (reading academic literature). Don’t fall into the same trap I did. Recognize that reading papers is its own skill, and train your ability to do it by practicing.
⁴ A review article is a type of paper that summarizes several papers from an area of research. Reviews are ideal starting points in your search because they cover many labs’ science.
⁵ If you don’t have a university affiliation, you should probably be aware of Sci-Hub. It’s a website that can provide access to most scientific literature. Not that you should use it, of course; it is very important to respect the intellectual property rights of journals. They earned the IP fair and square by *checks notes* allowing the scientists who wrote the paper to pay them thousands of dollars for the privilege of publication.
⁶ The order of authors has meaning in scientific publications. Most author lists are roughly ordered from largest to smallest contribution, except that the last author (often also the corresponding author) is usually the first author’s PI or the senior faculty member who contributed most to the work.
⁷ Yes, yes, that probability isn’t conditional on the research you’ve done, how good your application is, etc. Your actual probability of personally getting into each school is different from their historical average because populations and individuals are different.
⁸ Or so I’ve heard. I only had one interview, so both interview and application season were stressful :)