Confessions of a palaeobiologist learning to code

Learning to code is something that a lot of people do during their PhD. Programs like R or MATLAB are common in highly mathematical and quantitative fields, and in studies with large amounts of data. Because palaeontological datasets are often small, these kinds of analyses and programs have traditionally been uncommon amongst palaeontologists. When you have a sample size of 5, or perhaps 1, heavy statistics or analysis isn’t really necessary.

However, these types of programs are being used more and more as palaeontology moves towards more quantitative analysis, including molecular data. They are also extremely helpful for making plots and graphs, which look a heck of a lot better than those that come out of Excel, and they allow you to do a lot more with them.

For this reason, I’ve been starting to dabble a bit in coding, which has been a very steep learning curve for me. I have very little experience with anything like this, so basically everything is new. My husband, Josh, a physicist/engineer, is proficient in a lot of programming languages and has been helping me out, but it’s tough. I’ve been working with a program called Mathematica, which is like R or MATLAB I guess, but unlike R, it’s not free. I’m generally very pro free/open-source software and try not to buy stuff if I don’t have to, but I find Mathematica so much more straightforward and easier to use than the free alternatives. Plus the other half is a Mathematica wizard and can do pretty much anything with it. I generally get along with it OK, and have figured out a few things on my own, but I still require a lot of help from Josh. It can be infuriating…

The other thing I am learning and using more comprehensively is LaTeX (pronounced ‘Lay-tech’). For those of you that don’t know it, it’s a sort of word processing software, or in their terms, a “document preparation system for high-quality typesetting”. It’s particularly good at formatting equations, and is therefore widely used in areas like physics and maths. It’s also extremely good at formatting large documents, like a thesis, as it can do pretty much anything you want, as long as you can code it. It automatically makes the table of contents, figure lists and table lists, formats your figures, places them within the text in convenient places, AND HANDLES REFERENCES… all these little things that I find infuriating in Word, and that can be unmanageable in a large document. The references especially are amazing. They are so easy to reformat into whatever style you want, and it’s less bulky and buggy than something like EndNote, which I currently use when I’m using Word.
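
For anyone who hasn’t seen LaTeX before, here is a minimal sketch of the kind of document described above. It is an illustration only, not the thesis template discussed in this post. The file name `references.bib` and the citation key `example2015` are hypothetical placeholders, and `example-image` is a placeholder picture that assumes the `mwe` package is installed.

```latex
\documentclass{report}
\usepackage{graphicx} % needed for \includegraphics

\begin{document}

% These three commands build the lists automatically from the
% chapter headings and captions found in the document.
\tableofcontents
\listoffigures
\listoftables

\chapter{Introduction}
% \ref and \cite are filled in by LaTeX, so the numbering stays
% correct when figures or references are added or moved.
Growth curves are shown in Figure~\ref{fig:growth}, following \cite{example2015}.

\begin{figure}
  \centering
  % "example-image" is a placeholder; swap in your own file.
  \includegraphics[width=0.8\textwidth]{example-image}
  \caption{A placeholder figure.}
  \label{fig:growth}
\end{figure}

% Changing the reference format means changing this one line.
\bibliographystyle{plain}
% Hypothetical file references.bib holding the entry example2015.
\bibliography{references}

\end{document}
```

The document usually has to be compiled more than once (with a BibTeX run in between) before the table of contents, cross-references and citations all resolve.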

So I find myself trying to figure out LaTeX code… It’s not really code in the normal sense, but it’s kind of code. It can be extremely frustrating, but when it works, it is so wonderful. That’s unlike Word, where I’m never really happy with how the document looks, but it somewhat works so I give up. However, LaTeX has a tendency to break… like the other day, when my thesis code suddenly broke. I’ve been running off a thesis template from the university that is apparently super buggy and way more complicated than it needs to be, and earlier this week it decided it wasn’t going to work anymore. Now I am in the process of rebuilding all the important parts that were in that code (title page, abstract, authorship declaration… all the little things a thesis needs) while keeping to the rules of the university. I’ve been getting a lot of “fatal errors”, followed by “Missing $ inserted”, which are generally things I have no idea how to fix… but I’m slowly working through each error one by one.
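
One common cause of “Missing $ inserted” is a character that LaTeX only allows in math mode, such as an underscore, turning up in ordinary text. It is not necessarily what went wrong with this particular template. The sketch below shows the problem line (commented out so the file compiles) and two usual fixes; the variable name `femur_length` is made up for illustration.

```latex
\documentclass{article}
\begin{document}

% The next line, if uncommented, stops with "Missing $ inserted",
% because a bare underscore is only allowed in math mode:
% The column femur_length is measured in mm.

% Fix 1: escape the underscore when it is part of ordinary text.
The column femur\_length is measured in mm.

% Fix 2: use math mode when a real subscript is intended.
The mean length is $L_{\mathrm{femur}}$.

\end{document}
```

The line number reported with the error usually points at, or just after, the offending character, which makes it a good place to start looking.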

Learning to code in any form towards the end of my PhD has been frustrating and hard, but totally worth it. To be honest, most of the time I feel like a complete idiot and it takes ages for me to find the problem. But once I do, it’s rewarding and gives me a nice document to look at afterwards. The figures (particularly the graphs and plots) are so much better for having been done properly, and my thesis is manageable and well formatted. To anyone considering learning any of these coding languages, I highly recommend it. I only wish that I had got on this train earlier so that I didn’t have to learn how to do it at the same time as making the figures for my thesis. It’s been added to the list of things to focus on more after I finish my PhD, presuming I have some time before starting a postdoc or getting a job elsewhere. For now, I’m only working on whatever is necessary to get my thesis done…

And now, back to my PhD since I shouldn’t be blogging anyways. Not doing such a good job at keeping up with #thesissaysno lately!

3 thoughts on “Confessions of a palaeobiologist learning to code”

  1. I suppose my only comment about Mathematica or Matlab is that you may find things you want to do that aren’t in either, while R has an extensive set of libraries written by scientists. I ‘chose’ R because all the stuff I wanted to do with phylogenetics was in R. And because I wanted to go beyond what they did with phylogenies, I had to learn to code extensively in R so I could build on the available foundation.

    Of course, Mathematica can do some things with mathematical variables no other language can do (evaluate statements where variables are undefined), and Matlab has a number of visualization and physics-related functionalities. And, actually, a lot of non-phylogenetics bioinformatics stuff isn’t in R, but is in Python.

    In general, this means you often have to learn enough of several languages to at least call the functions you need in each of them, even if you stick to one particular language for building your own functions.


    1. Thanks for the comment! I suppose that most of what I’m doing right now is super basic, so Mathematica works well. R is on my list of stuff to potentially learn in the future, should it be relevant to what I’m doing. Currently I’m not doing a lot of phylogenetics, but it’s possible that I will be in the future, in which case R would be very useful.

      Interesting comment about Python. The only people I’ve heard talking about Python are physicists.

