Clare S. Y. Huang Data Scientist | Atmospheric Dynamicist

I love to share what I've learnt with others. Check out my blog posts and notes about my academic research, as well as technical solutions on software engineering and data science challenges.
Opinions expressed in this blog are solely my own.


My python library updated to v0.2.0!

I have updated my Python library hn2016_falwa to v0.2.0 (see the release notes)! It now includes functions to compute the contribution of non-conservative forces to wave activity.

Moreover, the documentation page generated with Sphinx is now hosted on readthedocs.org! Check it out!

A side note: I somehow made multiple commits to remedy a mistake. The git commands to squash the most recent commits (3, for example) and force-push the rewritten history are:

git rebase -i origin/master~3 master
git push origin +master

Wrapping Fortran Code in Python

To start with, the NumPy documentation explains how to wrap Fortran code in Python using f2py.

You need a Fortran compiler to run f2py. I found a pre-compiled version of GCC that can be readily installed on Mac OS X.
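
As a minimal sketch of that workflow, the snippet below writes a tiny Fortran subroutine to disk and assembles the f2py build command. The file name add.f90, the module name fadd and the subroutine add_arrays are hypothetical names chosen for illustration, and the build only runs when a gfortran executable is found on the PATH.

```python
import shutil
import subprocess
import sys
from pathlib import Path

# Hypothetical Fortran source, used only for illustration.
FORTRAN_SOURCE = """\
subroutine add_arrays(a, b, c, n)
  integer, intent(in) :: n
  real(8), intent(in) :: a(n), b(n)
  real(8), intent(out) :: c(n)
  c = a + b
end subroutine add_arrays
"""


def f2py_command(source_file, module_name):
    """Return the command that compiles source_file into an extension module."""
    return [sys.executable, "-m", "numpy.f2py", "-c", str(source_file), "-m", module_name]


def build(source_file="add.f90", module_name="fadd"):
    """Write the Fortran source and run f2py if a Fortran compiler is available."""
    Path(source_file).write_text(FORTRAN_SOURCE)
    command = f2py_command(source_file, module_name)
    if shutil.which("gfortran") is None:
        print("gfortran not found; would run:", " ".join(command))
        return False
    subprocess.run(command, check=True)
    return True
```

After a successful build(), the module should be importable from the same directory with import fadd. f2py normally infers n from the array shapes and returns c, so the call would look like fadd.add_arrays(a, b).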

(To be continued)

Compiling tensorflow on Mac with SSE, AVX, FMA etc.

(Ideally, I should run TensorFlow somewhere other than on my MacBook.)

When I installed Keras with Anaconda on Mac OS X, with TensorFlow as the backend, the following warning came up when running the sample script:

I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA

To use those instruction sets (SSE4.1 SSE4.2 AVX AVX2 FMA), TensorFlow has to be compiled from source. The build instructions are available here. I used the following command to build it:

bazel build -c opt --copt=-march=native --copt=-mfpmath=both --config=cuda -k //tensorflow/tools/pip_package:build_pip_package

I got the following error:

Problem with java installation: couldn't find/access rt.jar in /Library/Java/JavaVirtualMachines/jdk-9.0.1.jdk/Contents/Home

It turns out that rt.jar is no longer present in Java 9. To solve this, install Java 8:

brew cask install caskroom/versions/java8

and then point JAVA_HOME to Java 8 (change the version number ‘1.8.0_162’ to that of the version you installed):

export JAVA_HOME=$(/usr/libexec/java_home -v 1.8.0_162)
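
Before rebuilding, it can help to confirm that JAVA_HOME now points at an installation that still ships rt.jar. The following is a small sketch of such a check; the helper names find_rt_jar and check_java_home are hypothetical, and the two candidate locations assume the usual Java 8 layout (jre/lib inside a JDK, lib inside a standalone JRE).

```python
import os


def find_rt_jar(java_home):
    """Return the path to rt.jar under java_home, or None if it is absent."""
    # Assumed layout: a JDK 8 keeps rt.jar under jre/lib, a standalone JRE under lib.
    for relative in (("jre", "lib", "rt.jar"), ("lib", "rt.jar")):
        candidate = os.path.join(java_home, *relative)
        if os.path.isfile(candidate):
            return candidate
    return None


def check_java_home(environ=os.environ):
    """Describe whether the JAVA_HOME in environ contains rt.jar."""
    java_home = environ.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"
    rt_jar = find_rt_jar(java_home)
    if rt_jar is None:
        return "no rt.jar under %s (Java 9 or later?)" % java_home
    return "found %s" % rt_jar
```

Running print(check_java_home()) in the same shell session as the export shows which case applies.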

Afterwards, I tried to build TensorFlow from source again, and the build succeeded with the instruction sets above enabled.
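
The -march=native option enables whatever the build machine supports. If the binary is meant for a different machine, the instruction sets can be named one by one instead. The sketch below reads the warning line and turns it into explicit --copt options; the flag spellings are the usual GCC/Clang ones, and the helper names are hypothetical.

```python
import re

# GCC/Clang flags for the instruction sets named in the warning.
COPT_FLAGS = {
    "SSE4.1": "-msse4.1",
    "SSE4.2": "-msse4.2",
    "AVX": "-mavx",
    "AVX2": "-mavx2",
    "FMA": "-mfma",
}


def missing_instructions(warning_line):
    """Extract the instruction-set names listed at the end of the warning."""
    match = re.search(r"not compiled to use:\s*(.+)$", warning_line)
    return match.group(1).split() if match else []


def bazel_copts(instructions):
    """Translate instruction-set names into bazel --copt options."""
    return ["--copt=" + COPT_FLAGS[name] for name in instructions if name in COPT_FLAGS]


warning = ("I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports "
           "instructions that this TensorFlow binary was not compiled to use: "
           "SSE4.1 SSE4.2 AVX AVX2 FMA")
print(" ".join(bazel_copts(missing_instructions(warning))))
```

The printed options could replace --copt=-march=native in the bazel command. Names that are not in COPT_FLAGS are skipped rather than guessed.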

Setting up MySQL / access with python on Mac OS

Today I wanted to set up an automated Kickstarter scraper on PythonAnywhere, but realized that only MySQL is freely supported there (while I’ve been using PostgreSQL). So, time to switch?

Here is how I installed MySQL on my Mac and accessed it with SQLAlchemy:

  1. Download and install MySQL from Oracle.
  2. Go to System Preferences to start the MySQL server.
  3. Navigate to the bin directory and log in with the temporary password shown at the end of the installation:
    cd /usr/local/mysql/bin
    ./mysql -u root -p
    
  4. Create another user name and password to use instead of root:
    CREATE USER 'username'@'localhost' IDENTIFIED BY 'password';
    
  5. I installed pymysql and sqlalchemy (plus sqlalchemy_utils for the helper functions below) in Python to access the MySQL database. To access the database:
from sqlalchemy import create_engine, text
from sqlalchemy_utils import database_exists, create_database

# dbname is the database name
# user_id and user_password are what you put in above

engine = create_engine(
    "mysql+pymysql://%s:%s@localhost:3306/%s" % (user_id, user_password, dbname),
    echo=False)
if not database_exists(engine.url):
    create_database(engine.url)  # Create the database if it doesn't exist.

con = engine.connect()  # Connect to the MySQL engine
table_name = 'new_table'
command = "DROP TABLE IF EXISTS %s;" % table_name  # Drop the table if it exists
# Wrap raw SQL in text(); SQLAlchemy 2.0 no longer accepts plain strings.
con.execute(text(command))

Any other SQL command can be executed the same way:

con.execute(text(command))
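
One caveat with building the connection string by hand: characters such as @ or / in the user name or password would be read as part of the URL structure. A minimal sketch of one way around this, using only the standard library, is to percent-encode both before formatting. The function name mysql_url and the credentials below are hypothetical.

```python
from urllib.parse import quote


def mysql_url(user_id, user_password, dbname, host="localhost", port=3306):
    """Build a mysql+pymysql URL with the user name and password percent-encoded."""
    return "mysql+pymysql://%s:%s@%s:%d/%s" % (
        quote(user_id, safe=""),        # safe="" so that "/" is encoded too
        quote(user_password, safe=""),
        host,
        port,
        dbname,
    )


# Hypothetical credentials, for illustration only.
print(mysql_url("scraper_user", "p@ss/word", "kickstarter"))
```

The returned string can be passed to create_engine in place of the hand-formatted one.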

Enjoy! :)

Software Engineering Project Note-taking

Learnt a lot from peers today! :D Here are quick notes on packages they have used for their software engineering project:

Docker Hub: You store your docker images there.

DigitalOcean: A cloud host. The Docker images are pulled there.

Sphinx tutorial: Useful instructions on how to set up Python documentation

pydot and graphviz: Draw graphs of objects and arrows

Free Online Deep Learning Resources

These are deep learning resources that came up in conversations with friends:

Deep Learning Paper Reading Roadmap

Neural Networks and Deep Learning

Books introduced in the Data Science Central newsletter:

Deep Learning Book

Learning HTML, CSS, JavaScript and Jinja2

While building a web app with Python and Flask, I needed to include input options to make the app interactive. Here are some great sites I’ve learnt things from:

More updates later.

Setting up ubuntu on AWS

Solution for the error libSM.so.6: cannot open shared object file.

Python packages for Sentiment Analysis

On top of utilities in nltk.sentiment, there are also some packages for training and combining classifiers:

(More to be updated)

Published a paper on Local Wave Activity Budget!

I’ve published a new paper in Geophysical Research Letters!

Climate dynamicists have derived a conservation relation for wave activity (A), based on the small-amplitude wave assumption, that describes the evolution of Rossby wave packets:
Wave activity flux equation
However, only the wave activity flux vector on the RHS has been used to diagnose realistic climate data. A is ill-defined when the wave amplitude is large (i.e. ‘of finite amplitude’). In Huang & Nakamura (2016), we introduced a new theory of wave activity applicable to large waves, so a well-defined A can be obtained even from real data. This is the first piece of work that compares the LHS and RHS of the conservative part of the equation above for reanalysis data. This advance allows us to estimate the overall non-conservative contribution (natural/human-induced forcings) to the observed flow.
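
For readers without the figure at hand, a relation of this kind can be written schematically as follows. The notation is generic and assumed here for illustration, not copied from the paper:

```latex
\frac{\partial A}{\partial t} = -\nabla \cdot \mathbf{F} + D
```

Here F stands for the wave activity flux vector and D collects the non-conservative sources and sinks. Setting D = 0 gives the conservative part, so the mismatch between the tendency on the LHS and the flux convergence on the RHS serves as an estimate of D.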

Major results include:

(1) Our estimate of transient wave activity (top panel) is consistent with previous work (bottom panel, assuming small-amplitude waves) and is better behaved.

Comparison with previous work

(2) We can break down the local wave activity budget on seasonal time scales.

Wave activity flux equation

(3) We can also break down the budget on synoptic time scales with the use of co-spectral analysis.

Wave activity flux equation

Switching to Jekyll

I’m switching from a traditional HTML webpage builder to Jekyll! Hope to update more often!

I set up Jekyll on Mac OS X with Homebrew, rbenv and RubyGems.

To see how to set jekyll up on Windows, refer to my older post.

Installation of Jekyll on Windows 10

Below are the steps I used to install Jekyll, along with fixes for the problems I ran into:

Main reference sites:

Jekyll on Windows
Easily install Jekyll on Windows with 3 command prompt entries and Chocolatey

Procedures:

  1. Open Powershell
  2. Run Powershell as administrator: [Reference]

    Start-Process powershell -Verb runAs

  3. Change execution policy to enable installation of Chocolatey (a package manager): [Reference]

    Set-ExecutionPolicy RemoteSigned

  4. Install Chocolatey: [Reference]

    iwr https://chocolatey.org/install.ps1 -UseBasicParsing | iex

  5. Update the certificate to install ruby: [Reference]
    http://guides.rubygems.org/ssl-certificate-update/
  6. Install ruby:

    choco install ruby -y

  7. Close the window and open a new command prompt with Administrator access (i.e. step 2)
  8. Install gem bundler

    gem install bundler

  9. Install Jekyll

    gem install jekyll

  10. Done :)

Installing Python Library for downloading ERA-Interim Data

Update: ECMWF API Clients on pip and conda

The ECMWF API Python Client is now available on PyPI and Anaconda.
The Climate Corporation has distributed it on PyPI, so it can be installed via:

pip install ecmwf-api-client

If you are using Anaconda on OS X or Linux, you can install it via

conda install -c bioconda ecmwfapi
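
Once the client is installed, a download is a dictionary of request keywords passed to the server object. The sketch below builds such a request for ERA-Interim surface analysis fields. The helper names, the default parameter and the grid are illustrative assumptions, and the download step assumes an ECMWF account with the API key saved in ~/.ecmwfapirc.

```python
def era_interim_request(date_range, target, param="167.128"):
    """Build a request dict for ERA-Interim surface analysis fields."""
    return {
        "class": "ei",
        "dataset": "interim",
        "stream": "oper",
        "expver": "1",
        "type": "an",
        "levtype": "sfc",
        "param": param,        # 167.128 is 2 m temperature
        "date": date_range,    # e.g. "2016-01-01/to/2016-01-31"
        "time": "00:00:00/06:00:00/12:00:00/18:00:00",
        "step": "0",
        "grid": "0.75/0.75",
        "target": target,      # output file name
    }


def download(request):
    """Send the request; needs the ecmwf-api-client package and a valid API key."""
    # Imported lazily so that the request can be built without the package.
    from ecmwfapi import ECMWFDataServer
    server = ECMWFDataServer()
    server.retrieve(request)
```

For example, download(era_interim_request("2016-01-01/to/2016-01-31", "t2m_201601.grib")) would request one month of 6-hourly 2 m temperature.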
