Much of the code I’ve written in the real world is available as open source on my GitHub profile. Many of those contributions, though, are to external repositories where it might not be apparent which parts of the code I wrote. Below I’ve pulled out a few representative examples in various categories.
Creating reusable charts in D3: A charting library I spun out of my work on the Housing Insights website. It creates reusable chart objects that follow a consistent design pattern and provides boilerplate functionality for building new reusable D3/SVG charts. I also wrote a blog post describing some of my design decisions.
JavaScript, D3, object-oriented programming, data visualization
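To give a flavor of what a "reusable chart object" can look like, here is a minimal sketch of one common approach: a base class with chainable getter/setter methods that concrete charts extend. It is not taken from the library itself. The class names (`BaseChart`, `BarChart`), the `layout()` method, and the default sizes are all hypothetical, and it computes bar geometry in plain JavaScript rather than drawing with D3.

```javascript
// Hypothetical base class: every chart shares the same chainable accessors.
class BaseChart {
  constructor() {
    this._width = 400;   // assumed default size, purely illustrative
    this._height = 200;
    this._data = [];
  }

  // With no argument each accessor returns the current value; with an
  // argument it stores the value and returns the chart for chaining.
  width(value) {
    if (value === undefined) return this._width;
    this._width = value;
    return this;
  }

  height(value) {
    if (value === undefined) return this._height;
    this._height = value;
    return this;
  }

  data(value) {
    if (value === undefined) return this._data;
    this._data = value;
    return this;
  }

  // Subclasses override layout() to turn data into drawable shapes.
  layout() {
    throw new Error('layout() must be implemented by a subclass');
  }
}

// Hypothetical concrete chart: one rectangle per data point.
class BarChart extends BaseChart {
  layout() {
    const values = this.data();
    // Fall back to 1 so empty or all-zero data never divides by zero.
    const max = Math.max(...values, 0) || 1;
    const barWidth = this.width() / Math.max(values.length, 1);
    return values.map((v, i) => {
      const barHeight = (v / max) * this.height();
      return {
        x: i * barWidth,
        y: this.height() - barHeight, // SVG y grows downward
        width: barWidth,
        height: barHeight,
      };
    });
  }
}

// Usage: configure through chained setters, then compute the geometry.
const bars = new BarChart().width(300).height(100).data([5, 10, 20]).layout();
```

In a real D3 chart, the objects returned by `layout()` would typically be bound to SVG `rect` elements; keeping configuration in shared accessors is what lets each new chart type reuse the same boilerplate.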
Adding UI elements for each data type in Housing Insights: For most of our front-end Housing Insights code it is hard to separate the code I wrote from code written by others, as we went through a few refactors and added small feature upgrades that are spread throughout the codebase. One area where I had primary later-stage responsibility was refactoring the code that adds user interface components (drop-downs, sliders, etc.) for each data source on our main browse page. This includes the buildFilterComponents function (line 1145), which creates, for example, a categoricalFilterControl (line 1000) or a continuousFilterControl (line 296), both of which use the setupFilter function (line 896) to take care of the needs shared by both filterControl types.
For a more ‘real time’ view of the code I wrote for the project, check out these pull requests:
JavaScript, object-oriented programming, closures, interface design, D3
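The relationship between those four functions can be sketched roughly as follows. This is a simplified illustration, not the code from the repository: the function names come from the description above, but their signatures, the data-source fields (`name`, `label`, `type`, `options`, `min`, `max`), and the shape of the returned objects are all assumptions made for the example.

```javascript
// Shared setup: the state and behavior every filter control needs.
// All field names here are illustrative assumptions.
function setupFilter(control, dataSource) {
  control.source = dataSource.name;
  control.label = dataSource.label || dataSource.name;
  control.value = null;
  control.clear = function () {
    control.value = null;
  };
  return control;
}

// Categorical data sources get a drop-down with a fixed list of options.
function categoricalFilterControl(dataSource) {
  const control = setupFilter({ kind: 'dropdown' }, dataSource);
  control.options = dataSource.options.slice(); // copy so callers cannot mutate it
  return control;
}

// Continuous data sources get a slider bounded by min and max.
function continuousFilterControl(dataSource) {
  const control = setupFilter({ kind: 'slider' }, dataSource);
  control.min = dataSource.min;
  control.max = dataSource.max;
  return control;
}

// Build one control per data source, picking the type from its metadata.
function buildFilterComponents(dataSources) {
  return dataSources.map((source) =>
    source.type === 'categorical'
      ? categoricalFilterControl(source)
      : continuousFilterControl(source)
  );
}

// Usage with two made-up data sources.
const controls = buildFilterComponents([
  { name: 'ward', type: 'categorical', options: ['Ward 1', 'Ward 2'] },
  { name: 'units', label: 'Number of units', type: 'continuous', min: 0, max: 500 },
]);
```

Routing both control types through one setup function keeps the shared behavior (labels, clearing, current value) in a single place, so adding a new data source only requires new metadata rather than new UI code.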
Add a page-load modal and feature tour: This is a relatively simple feature that integrates a feature tour library, which starts on first page load via a welcome modal. It includes details such as a spinner background if the user closes the modal too soon (before the page has loaded) and changing the user options when the site is loaded with saved settings.
JavaScript, HTML
Housing Insights Backend Code: In addition to the front-end work listed above, I wrote a lot of the Python code for data ingestion. Here are a few of the more substantial pull requests:
Python, Flask, data ingestion, object-oriented programming
Yellowbrick Confusion Matrix: Yellowbrick is a visualization library for use in the machine learning process. It’s designed to be used in conjunction with scikit-learn, and is built in Python using matplotlib. This commit adds a visualizer I created that draws a heatmap of the confusion matrix report provided by scikit-learn, for use in evaluating the quality of classification algorithms. classifier.py is where the main code is found, and the .png image shows an example of what is produced.
Python, scikit-learn, matplotlib, object-oriented programming
Data Science Certificate Capstone: I wrote nearly all the code for my data science certificate program capstone project. Some key files to look at are the main ingestion, wrangling, and run_models files. Note that while the code in this project is relatively well structured into reusable functions and modules, it does have a fair number of temporary and working files, insufficient tests, and only moderate code documentation. This was a one-off project and not intended for production. In addition, see the final report for more guidance on the coding process and findings.
Python, scikit-learn, machine learning, functional programming