Running a server on Microsoft Azure: A UX report

Given the proliferation of open source cluster managers (for example, Mesos and Kubernetes) and PaaS software (for example, Deis and Dokku), the infrastructure layer is being abstracted further away from the application. The result is a cleaner separation between development and operations. The simplest example is probably Dokku: you install Dokku on a server, then git push to deploy your applications. That's it, just as you would on Heroku. There are no Puppet or Chef scripts to mess with to deploy your app. Given the freedom these tools provide for less demanding deploy workflows, I decided to try another Infrastructure as a Service.

I spun up an Ubuntu 14.04 virtual machine on Azure, set up Dokku on it, and deployed my applications for a data privacy project. From account sign-up to a running stack took less than an hour. There were no surprises, and the process was straightforward: sign up for an account, then follow the instructions. Once you're in the portal, click the giant New button to create a virtual machine. The UI guides you step by step through a horizontally expanding panel. Don't Make Me Think. I like it.

Azure new VM

I have to point out that the UI I'm showing is a preview of the next portal version. Compared to Azure's old UI, the improvements in both function and aesthetics are evident. For example, the old UI had a quick-create option for Ubuntu machines, but using it meant you could only log in with a password, which is bad practice. If you want to use an SSH key, you have to go through the detailed settings (screenshot below), which aren't obvious for a user to find. Even then you can't use a plain RSA key; you have to upload an X.509 certificate. The new UI has neither of those problems. There are a few small annoying quirks like that in the old UI, so I'd recommend anyone trying Azure for the first time use the Preview Portal. You can opt in from the Account menu at the top right.

Azure old VM

Once your VM is up and running (it took less than a minute for me), the instance view (back in the new portal preview) is informative and clean. Grab the SSH information from there to connect, and you're in. I'm impressed that the information I'd want to see on each page is exactly where it should be. It's as if Microsoft made the effort to think about the UX.

Azure VM information

Navigating to the Home dashboard, you can recognize Microsoft's signature Metro design. And guess what, billing information is right there! No need to jump to another part of the site like on AWS.

Azure dashboard

As much as I like the Azure management portal, in reality it might not matter much to ops people: if you're an ops person spending most of your time in the web management console, you're probably doing it wrong. For my use case of spinning up a server or two every now and then to host personal projects, the management UI is a convenient tool to have, and on web design Azure seems to beat AWS. But if that's your use case too, you probably don't need an IaaS at all; you'd get better value from a smaller provider like Linode or DigitalOcean.

I haven't had the chance to use Azure much yet. Still, the server has been up and running for a week, and I haven't needed to tinker with it, which is a good thing. The obvious negative at the moment is that the free trial only lasts 30 days. Other than that, so far so good.

Posted 19 June 2015 in computing.

19:57 from Castro to Twin Peaks

I've been running from Castro Station to Twin Peaks almost every morning for a month. I started off at about 25 minutes and 30 seconds from bottom to top. Today, on my last day here in San Francisco, I broke the 20-minute mark. It's a 2+ mile distance up 800 ft, so I'm pretty proud of making it in under 20. Here's a step-through of my run using Google Street View.

The start at Philz Coffee.

Left turn onto Castro Street.

First right onto 19th Street.

Left onto Diamond St. First hill.

Right onto 20th. More hill.

Up these stairs. More to come ...

At the top of the stairs, go across the street and up these steps to run along the elevated sidewalk on the right of Douglass St.

Turn right onto Romain St. This is a steep hill. Go all the way to the end.

Up this spiral and cross Market St. with the pedestrian bridge. Continue on Romain St. on the other side until the end.

Make a left on Corbett Ave. and follow the curves.

Up Hopkins Ave. on your right, which is the steepest road on this run. I've been told the house at the bottom of Hopkins has had cars rammed into it twice.

Go left on Burnett Ave.

Up these stairs just around the bend on Burnett Ave. It's all stairs from now on ...

Keep going up, up, and up.

Once you've reached the end at Parkridge Dr., go right and follow the left bend up the hill.

Get on this last set of stairs on your right.

Almost there! Take a right on Twin Peaks Blvd.

There's a plaque on the ledge there; that's my end-of-run marker. Enjoy the view!

Posted 15 June 2015 in journal.

A magical promise of releasing your data and keeping everyone's privacy

Differential privacy is one of those ideas that sound impossible: a mechanism that outputs information about an underlying dataset while guaranteeing that the individuals in the data cannot be identified, no matter the means of attack [1]. At a time when big data is hyped on one hand and data breaches seem rampant on the other, why aren't we hearing more about differential privacy (DP)?

I quote Moritz Hardt from his blog:

To be blunt, I think an important ingredient that’s missing in the current differential privacy ecosystem is money. There is only so much that academic researchers can do to promote a technology. Beyond a certain point businesses have to commercialize the technology for it be successful.

So what is differential privacy? First of all, DP is a constraint: if your data release mechanism satisfies it, you can be assured that your data is safe from de-anonymization. DP initially came out of Microsoft Research, and it has been applied in many different ways; there are DP implementations for machine learning algorithms, data release, and more. Here's an explain-like-I'm-5 description, courtesy of the Google Research Blog, of their RAPPOR project, which is based on DP.

To understand RAPPOR, consider the following example. Let’s say you wanted to count how many of your online friends were dogs, while respecting the maxim that, on the Internet, nobody should know you’re a dog. To do this, you could ask each friend to answer the question “Are you a dog?” in the following way. Each friend should flip a coin in secret, and answer the question truthfully if the coin came up heads; but, if the coin came up tails, that friend should always say “Yes” regardless. Then you could get a good estimate of the true count from the greater-than-half fraction of your friends that answered “Yes”. However, you still wouldn’t know which of your friends was a dog: each answer “Yes” would most likely be due to that friend’s coin flip coming up tails.
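The coin-flip scheme in the quote is an instance of randomized response, the simplest mechanism satisfying the DP constraint. Formally (this is the standard definition, not something specific to RAPPOR), a randomized mechanism $M$ is $\varepsilon$-differentially private if, for any two datasets $D$ and $D'$ differing in a single record, and any set $S$ of possible outputs,

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[M(D') \in S].
```

Intuitively, adding or removing any one person can change the probability of any outcome by at most a factor of $e^{\varepsilon}$, so no observer can confidently infer whether that person is in the data.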

Google's Chrome uses RAPPOR to collect some sensitive data that even Google doesn't want to store because of end-user privacy risks [2]. With the use of DP, they get access to useful data that they otherwise wouldn't have been able to collect.
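The dog-counting example is easy to simulate. Here's a minimal Python sketch of the coin-flip scheme from the quote (the function names are mine, not RAPPOR's API): each respondent answers truthfully on heads and always answers "Yes" on tails, and the aggregator inverts the known noise to recover the overall fraction.

```python
import random

def randomized_response(is_dog: bool) -> bool:
    """Flip a fair coin in secret: answer truthfully on heads,
    always answer "Yes" (True) on tails."""
    heads = random.random() < 0.5
    return is_dog if heads else True

def estimate_true_fraction(answers) -> float:
    """Invert the noise: E[yes rate] = 0.5 * p + 0.5,
    so p = 2 * (yes rate) - 1."""
    yes_rate = sum(answers) / len(answers)
    return max(0.0, 2 * yes_rate - 1)

random.seed(42)
true_fraction = 0.3  # suppose 30% of your friends are dogs
friends = [random.random() < true_fraction for _ in range(100_000)]
answers = [randomized_response(f) for f in friends]

print(f"estimated fraction of dogs: {estimate_true_fraction(answers):.3f}")
# the estimate lands close to 0.3, yet any single "Yes" is deniable
```

Note the trade-off: you recover the aggregate statistic accurately (the noise averages out over many respondents), but any individual "Yes" is plausibly just a tails flip.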

By this point, I hope you have a sense of what DP is and why it's useful. But how does it work? Luckily, I found out that Moritz open-sourced his MWEM algorithm on GitHub. I spent a couple of weekends deploying his Julia package and building a web application around it.

homepage

The site is live at (note the insecure HTTP). Give it a try! It doesn't do much yet though. It's a weekend hack, so I'm not sure if I'll do anything more with it. Email me if you think it could be useful to you. For now the app only takes binary values, so pretend your data are all Yes/No responses. The app could be patched to take in any numeric data; Moritz describes how in his paper [3], and he's open to sharing his existing C# code for reference.

The web application works by exposing Moritz's package as a RESTful API using Morsel.jl. The frontend is built with ClojureScript's Reagent. I couldn't find any PaaS that can run Julia applications, so I containerized the Julia part in Docker and deployed it myself. That was a bit annoying, as I kept finding bugs and had to submit a few patches along the way. I guess not many people are deploying Julia applications yet.

The whole stack is open sourced: web application, DP micro-service, and MWEM algorithm. Let me know of your thoughts.


  1. Ji, et al., Differential Privacy and Machine Learning: a Survey and Review, arXiv:1412.7584 [cs.LG]
  2. Erlingsson, et al., RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response, arXiv:1407.6981 [cs.CR]
  3. Hardt, et al., A simple and practical algorithm for differentially private data release, arXiv:1012.4763 [cs.DS]

With thanks to Chris Diehl for bringing DP to my attention.

Difference Between Never Did It and Did It

I spent my whole Saturday designing a new homepage for this blog. Pretty proud of the result, I showed it to my wife. She couldn't stop laughing at how bad it was. Not willing to admit defeat yet, I took it to /r/design_critiques seeking help, in a post titled "I've been told this looks like shit already. What can I do to un-shit it?" A particular comment there (quoted below) led me to write this article:

The design is very boring and it makes you look boring as a result. In a situation like this I'd recommend to just use a good looking template: is a good one for your needs. It's not productive at all to try and do something you're 1. not great at, and 2. not looking for work in. Hope this helps.

Is gaining knowledge by trying new things not considered productive anymore?

Before diving into that though, you're probably wondering what my awful experimental homepage looks like. I wanted it to convey a clean and succinct message but it came out more like a careless job done in 2 minutes. Here's a screenshot.

new homepage prototype

An excerpt from another comment in the same Reddit thread:

Just remember that just as you might make engineering look easy, designers make design look easy. Having Excel doesn't make me a data analyst just as having server access doesn't make someone a web designer.

My startup experience taught me fast to be a jack of all trades: user experience design, customer development, product management ... I am not so naive as to think that I can just jump in and take over anything. That's not the point. The point is that having some hands-on experience opened my eyes to how wrong I was in thinking I knew what those other roles actually entailed.

Take user experience design, for example. I used to think that you just use common sense, right? No, you need to understand the user, understand the system, then somehow bridge that gap between the two. Or sales. You just talk to a lot of people, right? No, sales is about understanding user demand and discovering how their needs overlap with what you can offer.

Everything looks easy from 30,000 feet up because you don't see the details. When you've never done something, you really don't know what you don't know. We don’t realize how we automatically make assumptions and over-simplify things we don’t fully understand as a coping tactic to fill in the gaps. That's a useful tactic in everyday life, as I really can't be bothered with all the details around me. But it’s not so useful when bootstrapping a business as you can easily get blindsided. Once you've done a new job or solved a problem once, you don’t guess or handwave your way around the details anymore. You become aware of what you are unaware of. That is the difference.

Anyway, I answered my own question on whether gaining knowledge by trying new things can be productive. Yes, it can. After that, you just need a bit of practice.
