2015 Year in Review

Another post to reflect on all that’s happened in the past year! I’m starting to write this post on a flight to Maui for my honeymoon, with my lovely wife sleeping next to me. What a difference a year makes!

Some of my major life events:

Books read that I’d recommend — with Amazon affiliate links


  • The Martian by Andy Weir
    • A well-paced and excellent story + lots of science. The audiobook edition is great.
  • The City & The City by China Mieville
    • Recommended to me by Eric White. A very different but imaginative detective story.


  • Diplomacy by Henry Kissinger
    • A serious read, and an interesting take on the history of European diplomacy right up through the early 90s. The twists and turns of Kissinger’s read on Americans from before WWI through today are very thought-provoking.
  • Islam and the Future of Tolerance by Sam Harris and Maajid Nawaz
    • An important discussion that needs to be had more frequently and more publicly.
  • Superforecasting by Philip Tetlock and Dan Gardner
    • Fascinating read on how everyday people can become much better at predicting future events (think questions like: will Assad give up power by April 3, 2016?). I wish it were a bit more of a how-to in the end.
  • The $12 Million Stuffed Shark by Don Thompson
    • A good read about the high-end art world, best paired with the next book. It’s almost too incredible to believe.
  • Seven Days in the Art World by Sarah Thornton
    • Same as above. If you’re interested in how the art world works, in broad strokes, read these two books together.

Places that I visited

  • Dallas, TX, USA — (family/friends)
  • Mountain View, CA, USA — where I live and work
  • Portland, OR, USA — first consulting trip
  • Amsterdam, NLD — (week in Amsterdam to watch Julija get her PhD cum laude and meet her family!)
  • Austin, TX, USA — SciPy 2015 conference (poster)
  • Portland, OR, USA — OSCON 2015 conference (talk)
  • San Francisco, CA, USA — Got married!
  • Yosemite, CA, USA — Upper Yosemite falls is quite the hike.
  • Maui, HI, USA & Kauai, HI, USA — Honeymoon!

Side projects

Looking forward to what the next year will have in store! The initial plans call for my first trip to Lithuania!


I just wanted to post the guts of a script that Colin Higgins (fellow Data Scientist at SVDS) wrote.

# step1
wget https://repo.continuum.io/miniconda/Miniconda-latest-MacOSX-x86_64.sh

# step2
chmod +x Miniconda-latest-MacOSX-x86_64.sh

# step3 -- run the installer; have to type spacebar and "yes"
./Miniconda-latest-MacOSX-x86_64.sh

# step4
source ~/.bashrc

# step5
conda update conda -y

# step6
conda create -y -n anaconda_r -c r r-irkernel r-recommended r-essentials anaconda

Now, switch into the anaconda_r environment (which will prepend your PATH in that one terminal ONLY) with:

source activate anaconda_r

and install extra packages like so:

conda install -c r rpy2 -y

This made it so that the R kernel, the Python kernel, and the rpy2 package were all working in the same environment (my previous blog post was a temporary stop-gap that couldn’t get there).
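
To illustrate the note about the PATH change applying to one terminal only, here is a minimal sketch. It is not what conda's activate script does internally: prepend_path is a hypothetical helper, and the environment directory is an assumed install location. A subshell stands in for a separate terminal.

```shell
# Sketch only: mimics the effect of activation on PATH, not conda's real implementation.
prepend_path() {
    # Put the given directory in front of PATH for the current shell only.
    PATH="$1:$PATH"
    export PATH
}

# Assumed location of the environment's bin directory (adjust to your install).
env_bin="$HOME/miniconda/envs/anaconda_r/bin"

# The command substitution runs in a subshell, so the change stays inside it.
inside=$( prepend_path "$env_bin"; printf '%s' "${PATH%%:*}" )
outside="${PATH%%:*}"

echo "first PATH entry inside:  $inside"
echo "first PATH entry outside: $outside"
```

The first line printed shows the environment directory, while the second shows whatever was at the front of PATH before, which is why every new terminal needs its own source activate anaconda_r.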


Conda error installing R for Jupyter

Fixing the error: unable to load shared object

Update 2015-12-12

I think using conda environments is actually the best way to go, so I recommend looking at my next blog post.

I was having a hell of a time getting the R kernel installed and working with Jupyter on OS X. A blog post from Continuum (maintainers of the recommended Anaconda installation package) had instructions that showed how easy it was supposed to be. The bottom of this post gives my fix if you have the same errors!

Using conda, I started out by installing R, the kernel, and the essential and recommended packages via:

$ conda install -c r r r-irkernel r-recommended r-essentials

When you run this command you should see some output which says something like:

Fetching package metadata: ......
Solving package specifications: .......................................................................................................
Package plan for installation in environment ~/anaconda:

The following NEW packages will be INSTALLED:

    r:             3.2.2-0        https://conda.binstar.org/r/osx-64/
    r-essentials:  1.1-r3.2.2_0   https://conda.binstar.org/r/osx-64/
    r-irkernel:    0.5-r3.2.2_1   https://conda.binstar.org/r/osx-64/
    r-recommended: 3.2.2-r3.2.2_0 https://conda.binstar.org/r/osx-64/

Proceed ([y]/n)?

Linking packages ...
[      COMPLETE      ]|################################################################################################################| 100%

At this point, you should try running jupyter notebook and opening a new R based notebook. If this works, you’re done. If instead the kernel dies immediately when you open a new R notebook, and the terminal reports something like:

[I 10:55:51.639 NotebookApp] KernelRestarter: restarting kernel (4/5)
WARNING:root:kernel aa97d080-6059-4a4b-bd6a-b88b17b9162a restarted

R version 3.2.2 (2015-08-14) -- "Fire Safety"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin11.4.2 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> IRkernel::main()
Error in dyn.load(file, DLLpath = DLLpath, ...) :
  unable to load shared object '~/anaconda/lib/R/library/rzmq/libs/rzmq.so':
  dlopen(~/anaconda/lib/R/library/rzmq/libs/rzmq.so, 6): Library not loaded: @rpath/./libzmq.4.dylib
  Referenced from: ~/anaconda/lib/R/library/rzmq/libs/rzmq.so
  Reason: image not found
Calls: :: ... tryCatch -> tryCatchList -> tryCatchOne -> <Anonymous>
Execution halted
[W 10:55:54.646 NotebookApp] KernelRestarter: restart failed
[W 10:55:54.647 NotebookApp] Kernel aa97d080-6059-4a4b-bd6a-b88b17b9162a died, removing from map.
ERROR:root:kernel aa97d080-6059-4a4b-bd6a-b88b17b9162a restarted failed!
[W 10:55:54.663 NotebookApp] Kernel deleted before session
[W 10:55:54.663 NotebookApp] 410 DELETE /api/sessions/6c791526-dc7d-41f3-95a7-b95f61400b0e (::1) 1.76ms referer=http://localhost:8889/notebooks/Untitled5.ipynb?kernel_name=ir

Then you’ve got a problem. If you have the same error as me, read on, otherwise, good luck!

Searching for the above error, I found this page, which links to this page, which says, cryptically:

Linking errors on Linux/OS X:
RZMQ/ZMQ version mismatch. You likely updated the system ZMQ lib and now RZMQ points to a nonexistent .so/.dylib file. Reinstall (and force recompilation of) RZMQ.

Tried a few things to do what’s suggested. Failed.

At the bottom of this GitHub issue thread was the recommendation to do:

$ ln -s ~/anaconda/pkgs/zeromq-4.0.5-1/lib/libzmq.4.dylib ~/anaconda/lib/libzmq.4.dylib

However, I needed to modify that line, so check if ~/anaconda/pkgs/zeromq-4.0.5-1/lib/libzmq.4.dylib exists, and if not, find the version that does, then link that like:

$ ln -s ~/anaconda/pkgs/zeromq-4.0.5-0/lib/libzmq.4.dylib ~/anaconda/lib/libzmq.4.dylib

Retrying $ jupyter notebook now worked for me!
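
The "find the version that exists, then link it" step can be sketched as a small shell function. This is a hedged illustration under the post's ~/anaconda layout, not a tested fix for every install; link_libzmq is a hypothetical helper name, and it does nothing if the link target already exists.

```shell
# Sketch: link the first libzmq.4.dylib found under any zeromq-* package directory.
# The default ~/anaconda prefix follows the post; pass another prefix as $1 if needed.
link_libzmq() {
    prefix="${1:-$HOME/anaconda}"
    target="$prefix/lib/libzmq.4.dylib"

    # Leave an existing file or link alone.
    if [ -e "$target" ] || [ -L "$target" ]; then
        echo "$target already exists; nothing to do"
        return 0
    fi

    # Try every installed zeromq build (e.g. zeromq-4.0.5-0, zeromq-4.0.5-1).
    for candidate in "$prefix"/pkgs/zeromq-*/lib/libzmq.4.dylib; do
        if [ -e "$candidate" ]; then
            ln -s "$candidate" "$target" && echo "linked $candidate"
            return
        fi
    done

    echo "no libzmq.4.dylib found under $prefix/pkgs" >&2
    return 1
}

# Usage (uncomment to run against the default prefix):
# link_libzmq "$HOME/anaconda"
```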


Here is an excellent talk by Michael Manapat at the PyData Seattle 2015 conference. I hope this style of talk, one that really digs deep with specific examples, becomes more common!

Michael Manapat: Counterfactual evaluation of machine learning models

The slides can be found here, and the paper that it’s partially based on is here.


Jupyter Notebook Best Practices for Data Science

I gave a talk on Friday (July 24) at the 2015 OSCON in Portland, OR. My topic was on the IPython (Jupyter) Notebook for Data Science, and it highlighted a number of challenges that come from needing to organize a data science workflow — especially in the context of working on a team of data scientists.

The video of my talk (not available just yet) is below:

I had a great time and I hope people find it useful. The github repository for my talk.


2014 in Review

Thanks to everyone who helped make this past year great – I’ve been incredibly fortunate to have people who have helped support me in all of my adventures. Below are a few highlights from 2014!

Some of my major life events:

  • Finished my 3 year postdoc under Michael Murphy at Swinburne University of Technology.
  • Published a paper with Michael that was the culmination of years of work. The pdf is here if curious.
  • Moved from Australia to the San Francisco Bay Area.
  • Brought Julija home to meet the parents over Thanksgiving.
  • Completed the Insight Data Science program.
  • Started at SVDS as a Data Scientist!

Places that I visited (and spent at least two nights this year)

  • Dallas, TX, USA (family/friends)
  • Washington, D.C., USA (AAS)
  • Phoenix, AZ, USA (visit Stephanie/Kelsey)
  • San Diego, CA, USA (talk at UCSD)
  • Melbourne, AUS (postdoc life, Marc visited!)
  • Hobart, AUS (Dave!)
  • Paris, FRA (week in Paris)
  • Amsterdam, NLD (week in Amsterdam –Julija!)
  • Zurich, CHE (Kern!)
  • Glasgow, GBR (IMAX Glasgow)
  • Cambridge, GBR (talk at Cambridge)
  • Berlin, DEU (photos)
  • Potsdam, DEU (visit at Potsdam University)
  • Sydney, AUS (Harley Wood Winter School)
  • Palo Alto, CA, USA (Insight Data Science)
  • Dallas, TX, USA (family/friends)
  • Mountain View, CA, USA (started work at SVDS)
  • Dallas, TX, USA (family/friends)

Fun final list:

Seasons experienced this year (in order)

  • Winter
  • Summer
  • Autumn
  • Spring
  • Summer
  • Winter
  • Summer
  • Autumn
  • Winter


One chapter closes; a new chapter opens

As of this week, I am officially no longer an astrophysicist. I start my next career as a data scientist in about a month. It’s been a fantastic experience for me both personally and professionally. Michael Murphy was an incredibly patient and encouraging boss from whom I learned more than I hoped.

Coming into this job I hoped to learn a ton, and I have, but many opportunities and experiences were completely unexpected, from observing at observatories like ESO’s VLT in Chile and Keck in Hawaii to being in an IMAX film. Finally, the amazing amount of travel that my position granted me was life-altering.

This is not my big farewell post, as I still have a week and a bit in Australia, but I had trouble sleeping last night so I decided to make a fun D3 map of all of the flights that I’ve taken during my time as a postdoc, starting with the flights from DFW (Dallas-Fort Worth) to LAX (Los Angeles) to MEL (Melbourne) in August 2011!

The code that I used to make this is available at this link.

Ok future, let’s see where we go from here.


2014 Harley Wood Winter School Invited Talk

Reproducible Open Notebook Science

This past weekend I gave an invited talk at the Harley Wood Winter School in Collaroy, New South Wales, AUS. It was an excellent conference at a beautiful location, and definitely a treat to be asked to speak about scientific computing and the future of reproducible open science.

Here’s the abstract from my talk:

Full-stack science workflow += the IPython notebook

My talk will range from setting up a bashrc, to how you spend your time on a day to day basis, to the ultimate goal of clearly communicating reproducible scientific results. I’ll have many examples of common pitfalls to avoid and a few tactics that can get you a series of small wins in the battle that we call research. Finally, I will demonstrate the IPython notebook which I think will become a game changer for sharing reproducible science. I will make my slides and my code publicly available after the talk.

I was a bit nervous because the talk was going to include me doing interactive coding and demonstrations — which is always dangerous — but I think that it ended up going rather smoothly. As promised, I am making my slides and random examples available. It also gave me the opportunity to talk about where I hope to see science heading into the future. Reproducible code, shared and open data and notebooks. The ability to reproduce the exact plots in a paper is now easily upon us, and we should strive to have this be the standard going forward.

It was also the last talk that I will give as an astrophysicist because I’m starting the Insight Data Science Fellowship program in September! I’m very excited about that, and I also wanted to talk with the audience about so-called ‘Plan B’ career trajectories out of academia.

First my (interactive) slides

My first set of slides — I tried to export into a reveal.js slideshow, but I failed, so it’s one long (downloadable) IPython notebook.

Second set of slides which includes the Bayesian Blocks example from the AstroML: Machine Learning and Data Mining for Astronomy library.

Example of the future of science

A possible flow of events

You can now email that link to anyone in the world who has a browser. No python, no IPython, nothing needs to be installed. The barrier to sharing the analysis here is about as close to zero as we can get.


I repeatedly tried to make the case to think about your workflow — the more often you do an action, the more you should think about optimizing it.

  • A couple of useful .bashrc commands to make life easier: bashrc
  • This includes the save function which allows you to simply cd example and return to the saved directory (stored for future use as well).
  • Sublime Text — A text editor worth getting to know (Available OS X/Linux/Windows)
    • How to get LaTeX installed — excellent blog post.
    • And this blog post as well.
    • Finally, but most importantly, this series of screencasts of how to effectively use Sublime Text. Worth watching all the way through once, using Sublime Text for ~ 1 month, then rewatching.
  • Divvy — use a keyboard shortcut to call up a window and resize to custom sizes (Available for OS X/Windows).
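
As a rough idea of how a save function like the one mentioned above could work, here is a hypothetical sketch. It is not the actual function from the linked bashrc: the SAVE_FILE location is an assumption, and the "simply cd example" behaviour relies on bash's cdable_vars option.

```shell
# Hypothetical sketch, not the author's actual .bashrc.
# SAVE_FILE is an assumed location for the stored bookmarks.
SAVE_FILE="${SAVE_FILE:-$HOME/.saved_dirs}"

# bash only: lets `cd example` fall back to the value of $example
# when no directory called "example" exists in the current directory.
shopt -s cdable_vars 2>/dev/null || true

save() {
    # Usage: save example
    # The name must be a valid variable name; this sketch does not
    # handle directories whose path contains a single quote.
    printf "%s='%s'\n" "$1" "$PWD" >> "$SAVE_FILE"
    # Load the new bookmark into the current shell.
    . "$SAVE_FILE"
}

# Reload bookmarks stored by earlier sessions.
if [ -f "$SAVE_FILE" ]; then
    . "$SAVE_FILE"
fi
```

With this in a .bashrc, running save example inside a project directory stores it, and cd example from anywhere else returns there, including in future terminals.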


I currently recommend getting python, IPython, and the IPython notebook through the Anaconda installation method.

A list of a few python tutorials:

IPython notebook links

  • Notebook Viewer
  • The IPython notebook is moving to Project Jupyter in the near future. Don’t worry, this is the same old IPython notebook; it’s being rebranded because it now supports R, Julia, and other languages.


Besides the few lines I recommend adding to your .bashrc above, these are a couple of handy snippets that can be used in bash:

# iterate over numbers from 1 to 12
for index in {1..12}
do
    echo example.$index.name
done

# iterate over all command line arguments
# Save this in a file called demo.bash then run the command
# bash demo.bash hi this works
# to see what happens.
for name in "$@"
do
    echo "$name"
done
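
One detail worth knowing when looping over command line arguments: whether $@ is quoted changes what the loop sees. The sketch below uses two hypothetical helper functions (count_quoted and count_unquoted are illustration names only) to count the loop iterations in each case.

```shell
# Count how many items a loop sees, with and without quoting $@.
count_quoted() {
    n=0
    # Quoted: each argument stays intact, even if it contains spaces.
    for name in "$@"; do n=$((n + 1)); done
    echo "$n"
}

count_unquoted() {
    n=0
    # Unquoted: each argument is split again on whitespace.
    for name in $@; do n=$((n + 1)); done
    echo "$n"
}

count_quoted "hi there" works     # prints 2
count_unquoted "hi there" works   # prints 3
```

Quoting "$@" is usually the safer default, since file names with spaces then survive the loop in one piece.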

Let me know if you saw the talk and what you thought of it! Or if I forgot to put a link to something that I mentioned.


ipython notebook tips and tricks for science research

David Lagattuta and I gave a seminar at CAS about using python, ipython and the ipython notebook. At the end of it we made 3 of our ipython notebooks available to all.

These notebooks have embedded in them examples of images, text, LaTeX, and even embedded YouTube screencasts that explain aspects of the notebook within the notebook itself.

  1. ipython notebook: Future of Science (http://nbviewer.ipython.org/5742826)
  2. Clean Code; Clear Code (http://nbviewer.ipython.org/5742829)
  3. The Future of Science — EXTRAS (http://nbviewer.ipython.org/5742830)

Here’s a screencast that we embedded which shows some neat ipython notebook features:

As always, I’m looking to improve my coding (and presentation) skills, so constructive criticism is requested.



Hidden Universe IMAX 3D

I’ve been involved (http://hiddenuniversemovie.com/the-film/the-astronomers/) with the production of a 3D IMAX film called Hidden Universe.

The locations it’ll be showing (so far) are here: http://hiddenuniversemovie.com/theatre-locations/

Some photos that I took of the filming in Chile (back in November 2012).

2012-11 IMAX

