Last interview question you’ll ever need: Kobayashi Maru

It’s becoming more and more difficult to assess candidates. People can easily google “interview questions” and get 833 million results. Websites offer “smart answers to tough interview questions”. They can tell you exactly what questions were asked last week at Google. Sites go as far as coaching you on what emotions to show and what corny jokes to tell. It’s an interviewing arms race that has interviewers searching for more and more inane questions like “Why is a manhole round?” and interviewees feeling like they’re studying for a standardized test.

In Star Trek lore, there’s a test at Starfleet Academy called the Kobayashi Maru. It’s a no-win situation where either you let a Federation ship get destroyed or get yourself destroyed by attacking the Klingon fleet. The simulation is designed to see how cadets react to an impossible situation. I use this same strategy in my interviews. I present candidates with a scenario that is relevant to the particular job and let them explore and struggle through the simulation. I even play the evil computer program and change the assumptions as I go. No two interviews are the same. For example, for web developers the scenario is that having just programmed your app, you check it out in a browser and see a blank page. What do you do?

The nature of work is changing. The interview methods of the past don’t meet the needs of today’s jobs, which require more creative problem-solving and less memorization and assembly-line widget-building. What you really want is to see how candidates think and handle new situations. It’s quite easy to tailor this test for your needs. Hiring a systems engineer? Put them in an outage situation and ask them what to do. You’ll find out within 5 minutes whether you want that person handling your servers. If you’re looking to get hired, impress me with your knowledge of the system. Bonus points for cheating.

Project Management for Startups

In large companies, project management is a must; in startups, it’s a radioactive “hot” potato. Large companies have dedicated project managers and sometimes even a Project Management Office (PMO). Small companies like ours don’t have one and we’re pushing 60 employees now after 18 months. Eventually, it’ll make sense to have a full-time project manager, but until then we want to stay agile so we get by using software, process, and face time.

Use the Right Software

When our Product-Engineering team consisted of CEO, CTO, and two engineers, email was our issue tracker and a whiteboard contained our roadmap. That worked well. Soon we hired a few more engineers and email proliferated. Pivotal Tracker (PT) became our issue tracker (still free then!). PT is simple, light, and an engineering team’s dream–a place where only engineers could hang out. We upgraded to Google Spreadsheets for our roadmap so we could do more sharing and simple Gantt charts. Once we hired some product managers, added a QA team, and started developing many projects at once, it was time to swim in the Olympic-size pool: Jira. Besides being created by a hilarious bunch, Jira does everything. That’s part of the reason we resisted using it early on, since we didn’t want to fill out so many corporate-y fields like “hours worked” (::shudder::). As for our roadmap, it broke Google Spreadsheets and I grew tired of the “aw, snap!” pages. I’m experimenting with a few tools and so far I like Asana and Smartsheet. Find what works for you now.

Evolve Your Process, But It’s All About Face Time

We tried to add only as much process as we needed over time. You start out with a fast track from feature conception to release. When you have two people who work on a project, you can sit down and have a heart to heart, nod heads, and then start coding. When a product manager has to convey thoughts to a few developers and a few QA–this is a small team still–details get lost in translation. Not only that, conversations may happen between members of the team that aren’t shared with the rest of the team. Let the frustration and angst begin. For a while, we went with the Product Requirements Document (PRD) approach, which is very monolithic. We’ve since moved away from that. Recurring meetings get moved around and their titles and agendas change. We went from 1-week to 2-week releases. Processes should change when the team changes or when the needs of the team change. It all comes down to getting a bunch of people to share ideas and work together efficiently. All this software and process is meant to cheat time–time to talk to one another face-to-face that is very hard to come by.

Why your engineering team needs Vagrant

One of the first things I told our CEO when I joined as the first engineering hire was to buy every engineer a 15″ MacBook Pro. I wanted to have a consistent development environment for my team. I created a Google Doc and at first I ran through this How-To personally on every new machine for every new hire. The last 3 or 4 hires were given this doc on their first day and their setup time became part of the “How quickly can I set up my dev box” game. The record is 4 hours, but that engineer already had most of the dependencies set up. The longest was two days.

It’s All Fun and Games Until Someone Upgrades

In the last 18 months, Apple has changed the processors, memory, and countless other aspects of the 15″ MacBook Pro. OS X has gone from Snow Leopard to Lion to Mountain Lion. Ruby has gone from 1.8.7 to 1.9.3, Rails has gone from 2.3 to 3.1, and I can’t keep track of all the versions of gems that have changed. The development world–it changes fast. How bad could it be, really? Well, I include 4 different manual “patches” in that doc of mine to make mysql work. The working patch depends on whether you have 32-bit or 64-bit OS X, your version of Xcode and mysql, whether you used a tarball or a dmg, whether you use the mysql or mysql2 gem, etc. Lately, I just tell new hires to try everything until they find something that works. Even for packages installed via macports, certain versions of ports don’t install properly. Some of these can be fixed by editing the portfile with information culled from the Interwebs. Other times, people just give up and use the homebrew version. This is us at 10 engineers.

Remove Doubt

Development and debugging can often lead to situations where you try to find what is different about a working scenario and a broken one. If a test passes on my machine but not on yours, I don’t want to say “it could be because you’re using Rails 2.3 vs 3.2 or Ruby 1.8.7 vs 1.9.2” or, God forbid, “it could be because you’re using Windows”. Ideally, I would like to say “it’s because you forgot to pull the latest code” or “your custom configuration file is not consistent with mine”. The more differences there are between one environment and another, the more variables there are to consider–follow the rabbit down the hole.
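
One low-tech way to shrink that list of variables is to have each machine print a small fingerprint of its environment and compare the two. Here is a minimal sketch of the idea in plain Ruby. The helper names `environment_fingerprint` and `environment_diff` are made up for illustration (they are not part of any tool mentioned here), and the teammate's Ruby version is invented.

```ruby
# Collect a few facts about the current Ruby environment.
# Hypothetical helper for illustration; add whatever matters to your
# team (database version, OS release, and so on).
def environment_fingerprint
  {
    "ruby_version"     => RUBY_VERSION,
    "ruby_platform"    => RUBY_PLATFORM,
    "rubygems_version" => Gem::VERSION
  }
end

# Return only the keys whose values differ between two fingerprints,
# each mapped to a [mine, theirs] pair.
def environment_diff(mine, theirs)
  (mine.keys | theirs.keys).each_with_object({}) do |key, diff|
    diff[key] = [mine[key], theirs[key]] unless mine[key] == theirs[key]
  end
end

mine = environment_fingerprint
# Pretend a teammate pasted their fingerprint; the version is made up.
theirs = mine.merge("ruby_version" => "1.8.7")

environment_diff(mine, theirs).each do |key, (a, b)|
  puts "#{key}: mine=#{a} theirs=#{b}"
end
```

If the diff comes back empty, you can go back to suspecting the code or the config file instead of the machine.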

Abstraction FTW

Just as rvm helps engineers manage Ruby versions and bundler manages Ruby gems, Vagrant helps engineers manage their entire development environment. With the concept of “boxes”, an engineering team can build Virtual Machine images to suit their every need. Want to onboard a new hire? Give them a computer and have them load your team’s starter box in 5 minutes. Want to roll out security updates to all 500 of your engineers? Script everything up using Chef or Puppet and have everyone download a new box. Want to test how your app would run in Ubuntu rather than Gentoo or OS X? I just did that while writing this post. When do I really need Vagrant? For small teams (2-3), you can probably live without Vagrant. However, if you want peace of mind or plan to scale, check it out now.
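
For a sense of how little a team member has to touch, here is a rough sketch of a Vagrantfile that loads a shared starter box. It uses Vagrant's version "2" configuration syntax, which is newer than the release I was using when I wrote this post. The box name, the forwarded port, and the provisioning command are all placeholders, not values from our actual setup.

```ruby
# Vagrantfile (sketch). The box name, port, and provisioning command are
# placeholders for illustration only.
Vagrant.configure("2") do |config|
  # The team's shared base image; "myteam/starter-box" is hypothetical.
  config.vm.box = "myteam/starter-box"

  # Expose the Rails development server running inside the VM to the host.
  config.vm.network "forwarded_port", guest: 3000, host: 3000

  # Run a one-off setup step when the machine is first created. A real
  # team would more likely hand this off to Chef or Puppet.
  config.vm.provision "shell", inline: "cd /vagrant && bundle install"
end
```

With a file like this checked into the repository, a new hire would run `vagrant up` to build the machine and `vagrant ssh` to log into it.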

MongoDB + Hadoop = gg

I’m not sure if kids these days still do this, but when I was a gamer and I had just beaten someone soundly, I simply typed “gg”. It’s short for “good game” but it really means “I just dominated you” or “game over, loser”. When you use Mongo with Hadoop, it’s kinda like that. With Mongo, you get a flexible, scalable database that excels at real-time processing. More and more startups are using it today and it’s our primary database (here’s why). With Hadoop, you get a distributed processing framework that handles everything you can’t or don’t want to process in real time. It’s even easier to scale than Mongo, Amazon has productized it (Elastic MapReduce), and it’s the Swiss Army knife of Big Data. Here are some reasons why they are a match made in data store heaven:

You Complement Me

Mongo is fast; it’s optimized for speed. Say goodbye to transactions and joins and other features you may not need. You can use it as your primary database to support your application in real time. It’s not so good on large datasets or complex querying. You lost joins, remember? And you want to shard my what? This is where Hadoop comes in. If you’re thinking about doing Big Data analytics, take those Nginx logs and crunch those numbers in your Hadoop cluster. Or if you’re tinkering with the latest machine learning algorithms to predict your users’ preferences–Taste Graph anyone?–it comes in handy.

Two Words: Map and Reduce

Map/Reduce (M/R) is at the core of Hadoop. It allows you to break down complex tasks into manageable chunks of data and processing. Mongo took a page out of Hadoop’s book when it included an implementation of M/R. It makes it even easier, in my view, because writing the functions in Hadoop’s native Java is usually more confusing than writing JavaScript for Mongo. Add some Ruby and you’ve got dynamic M/R in your Rails app! You can write mappers and reducers in Mongo to validate your Hadoop Java code on a smaller data set. And when you’re ready for the big show, you can fire up your 1000-server cluster to find the question to 42.
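
If the two words are still abstract, here is a minimal in-memory sketch of the map and reduce phases in plain Ruby. It does not use Mongo’s or Hadoop’s actual APIs (Mongo’s native M/R functions are written in JavaScript, Hadoop’s in Java). The log lines and the `map_reduce` helper are assumptions invented for illustration.

```ruby
# Map phase: turn one record into zero or more [key, value] pairs.
# Here a record is a made-up access-log line and the key is the request path.
mapper = lambda do |line|
  path = line.split(" ")[1]
  path ? [[path, 1]] : []
end

# Reduce phase: collapse all values emitted for one key into a single value.
reducer = lambda { |_key, values| values.reduce(0, :+) }

# Tiny driver: map every record, group the pairs by key, reduce each group.
# A real cluster does the same thing, only spread across many machines.
def map_reduce(records, mapper, reducer)
  grouped = Hash.new { |hash, key| hash[key] = [] }
  records.each do |record|
    mapper.call(record).each { |key, value| grouped[key] << value }
  end
  grouped.each_with_object({}) do |(key, values), result|
    result[key] = reducer.call(key, values)
  end
end

logs = [
  "GET /home 200",
  "GET /pricing 200",
  "GET /home 200"
]

hits = map_reduce(logs, mapper, reducer)
hits.each { |path, count| puts "#{path}: #{count}" }
```

The mapper and reducer know nothing about each other or about where the data lives, which is what lets the same pair of functions be checked on a small data set first and moved to a bigger cluster later.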

Have Your Cake and Eat It Too

If you don’t know what to choose for your task, you can always use both at the same time. With a plugin, you can use Mongo as an input or output for Hadoop. It even has some optimizations for splitting the input on every chunk in a sharded environment. We’ve tried this for one of our features and it works very nicely. Eventually, your data requirements may grow such that you’ll have to go fully into Hadoop, but you can get away with this hybrid approach for a long time. If you’re looking to speed up processing time, you could farm out some data to Hadoop, have your cluster crunch the data in bite-size chunks, and do some more processing in your application–all within a Resque job.

Lessons from Codecademy

My wife wants me to teach her Ruby so I did something very DRY: I tried out Codecademy. This new EdTech site teaches you to program with interactive lessons. I know they’re onto something because I have many things to say and suggestions that could apply to any website.

Props

The UI is very clean and the flow is better than average, especially for the level of complexity in the lesson creator. The progression is very clear and the gamification adds some color (more on this later). I love the live interaction using the console and the conversational style of the lessons. I haven’t done any of the advanced JavaScript lessons, but I did spend a few hours and managed to create my own lesson for Ruby.

Things to Improve

Make Your Model Structure Crystal Clear

It’s pretty hard to grok the way things are structured on the site. There’s a diagram in the documentation, but one of the main things holding me back was understanding the connections between Topic, Section, Exercise, Lesson, etc. I often tried to reference the examples from the core JavaScript lessons but there seemed to be a disconnect between those and the lesson creator. This is understandably new ground and complex, but whether I’m a teacher or a student on the site, I won’t get far if I don’t get the model structure.

Poor Linkage of Documentation to UI

Creating a lesson could be simpler. There seems to be an entire documentation section, but it would have saved me a lot of headaches if there were links in the lesson creator directly to the pertinent sections of the documentation. For example, when you’re trying to specify the teacher code that checks the student submission, there’s a pop-out with three examples. What is critically missing is the fact that you have access to three variables called “code”, “result”, and “error”. Without this nugget, the teacher will be scratching their head for a bit. In fact, I would suggest making the configuration for the core JavaScript lessons viewable by lesson creators (I was trying to mimic it anyway) or even creating a lesson to teach you how to use the lesson creator.

Get Serious About Gamification

Even though the badges are cute, it’s obvious that the game mechanics are not well thought out. I’m not the game designer in my company, but I’ve been around them long enough to know that you probably want to have leaderboards, levels, progress, more messaging on the “how” and “why” of the system, etc. As a student and a teacher in the system, it wasn’t clear to me where I was in relation to other players and where I should be going. This part’s not as big of a deal, since they seem to have their core competency down, but engagement is a big part of any crowd-sourced content site.

Overall, Codecademy has captured the core experience, which is why I would use it and, from what I hear, 500,000+ other people would too. One additional philosophical argument I would echo from this piece by Audrey Watters is that the site is lacking the conceptual component. Sure, there’s a “glossary” that you can link to from what I can glean in the Markdown examples (yeah, I made the connection!), but it’s almost an afterthought. It’s equally important to weave conceptual learning into this experience as well. But it’s likely they’ll add that soon since they only launched last August.

When you should use MongoDB

Having built an enterprise SaaS gamification platform with hundreds of millions of documents and soon-to-be billions as we grow from hundreds of clients to thousands in the next few years, I’ve thought a lot about MongoDB as a primary database. We are pushing it to the limits and are living pretty close to the bleeding edge. Thus far we’ve been pretty happy with the choice but Mongo isn’t for everybody. I get the sense that some people are trying to use it for the wrong reasons and then complain when things don’t work out. Here are some of the reasons why we decided to use MongoDB:

We wanted a dynamic database schema. We are a Behavior Platform. We record arbitrary behaviors for our clients and do interesting things with them. For example, “Joe commented on an article” could easily be “Joe checked-in at a bar in San Francisco called 21st Amendment on 3rd Street in the SOMA district”. By offering almost infinite flexibility, we can support a variety of use cases from e-commerce to education. This was our most important requirement. As a bonus, Mongo offers asynchronous indexing and thus we don’t need to do database migrations and all deploys now require zero down-time.

We wanted something that scaled easily. Given that we’re a platform and our data grows with the number of clients we have and over time, we aren’t your ordinary build-it-and-hope-they-will-come website. Our configuration started with Master-Slave, then Replica Sets, and now Sharding. In some of our applications and in certain environments, we still use non-sharded setups. Mongo makes it easier to set up these configurations but there’s still significant time involved to develop and harden your infrastructure. Once it’s set up though, it’s really nice to watch as data gets sharded automatically and rebalanced in real time. It’s also nice to know that you have multiple redundant servers with automatic failover.

We wanted Map/Reduce. On top of flexibility in storing the data, we wanted flexibility in processing it. Mongo gave us the ability to develop certain features very quickly because our primary database supported this rich framework. We even wrote some early analytics implementations using the native Map/Reduce in Mongo. At a certain scale, doing Map/Reduce on your primary database will dramatically hinder normal performance, but Mongo gave us plenty of time to port our mappers and reducers easily to Hadoop.

There are many reasons to stick with a relational database. If you need transactions or you prefer the comfort of many more years of “bake-in” time, NoSQL will not sit well with you. With Mongo, there will be a smaller community of developers, fewer tools, and definitely fewer Stack Overflow questions. However, if you’re trying to build an exceptional application or platform with ambitious requirements, Mongo might be the one for you.

Thinking Beyond Pair Programming

Pair programming can improve productivity but it’s most likely hiding more elusive problems. It could be that the feature is not clearly defined or that the team member is not motivated to work because you’re not giving her enough creative leeway. It may even be that she is not technically equipped to implement the feature.

The idea is simple and that’s how it seduces you. As a manager, you may try this technique and see increases in productivity–but are you unlocking the true potential of each individual? It is far more likely that you’re still missing the real issues at hand. Before you adopt pair programming ask yourself what the underlying hindrances are to productivity.

Sometimes it makes sense to pair a junior programmer with a mentor for training purposes but for the most part individual programming with a lead helping to remove roadblocks strikes a good balance. Furthermore, for a small engineering team (2-10), which is the norm for most startups, you should be programming as a team.

How to Enable Real-time Web Development

Startup Weekend puts your team in the same room for a reason: to facilitate real-time collaboration. With the right setup, your team can develop a web application live. Let’s assume you have a mixed team including developers, designers, teachers, and entrepreneurs. Assume everyone has a laptop on the same network. Using git and Ruby on Rails (the example can apply to any source control and web framework), you could have each laptop run a local web server. Everyone would be able to connect to any team member’s server and view the application as it changes with every code tweak. Now the developers can build the infrastructure and add styling, the designers can drop in images and icons, the teachers can edit the content, and the entrepreneurs can proofread the text and give feedback on design–all at the same time.

This is possible today because my team did it in a limited way. The designers didn’t run their own copies of the server but merely connected to one of the development machines to view the live changes. Teachers and entrepreneurs watched changes happen live as well, suggesting content and copy fixes in real time. Furthermore, although some team members did not change the code directly, I believe that with minimal setup and training anyone should be able to easily update at least content or copy, since web frameworks strive to separate logic from static content. With this level of collaboration, product development can happen in real time.

Open Source and The Evolution of Full-Text Search

Image

A few months back, I was asked to estimate how long it would take to implement scalable full-text search. I instinctively cringed and started to give my standard expectation-setting reply where references would be made to the complexity of Google and keywords like “stemming” would be floated like mines in the ocean. But I caught myself–instead, I replied “I have a few options in mind but let me do some research and get back to you”.

With Open Source software today, the pace of innovation is so fast that in many situations it makes sense to spend time researching the latest and the greatest even if it’s only been months since you last checked. This was my third run-in with search. The first was with Ferret and its infamous fling with the Rails community a few years back. Because of performance reasons, Engine Yard added it to a naughty list (as seen above) and there it has remained. My second involved Sphinx. It took me months to write integration tests, tweak performance, and configure the options. With elasticsearch, it was like using an iPhone for the first time minus the $299 + 2-year contract.

What can Open Source do for me (an engineer)? Full-text search is one of many tools that have matured after years of iteration within the community. Instead of continuously reinventing the wheel, we help each other build ever more powerful components. In fact, joining the effort could mean as little as using software and reporting bugs. Open Source is also a great way to learn new techniques and shed bad habits. Because everyone is watching, transparency encourages accountability.

What can Open Source do for me (an entrepreneur)? In the past, Open Source software meant unreliable software to entrepreneurs. Today, it’s our ticket to a quick launch and fast iteration. Along with its commercial cousin, Software as a Service (SaaS), Open Source can take care of many non-core parts of your application like talking to the database and rendering pages (Rails), sending emails (SendGrid), and now search (elasticsearch). No matter how impressive it would be for your engineers to build you a custom email solution, their time would likely be better spent figuring out how to get those pins to stick to the board.

Demystifying the Cloud

One of the hottest buzzwords in the Valley today is “Cloud”. You see it on accident-inducing billboards like Microsoft’s “Virtualization alone does not a cloud make”. You hear murmurs while grocery shopping like “Omg, Apple’s coming out w/ the iCloud–it’s gonna be sick!”. But before you call Microsoft for some meditation lessons or march down to the Apple store for an iCloud 5, take a reality check with me.

The Cloud will not make your web application better. If your application sucks now, it will not get any better in the Cloud because normally the code for your application will be exactly the same whether you run it on your own servers or on Amazon’s. There are many ways the Cloud can help your cause but most of these relate to infrastructure and not your core application.

The Cloud isn’t for everybody. There are cases where you shouldn’t leverage Cloud Computing. For example, if you’re feeling adventurous with the law and start a gambling website where latency and security are top concerns, it might be prudent to buy your own hardware up front. With your own hardware, you always have more control.

The Cloud isn’t always cheaper. Depending on your needs, it might be more expensive to host your application in the Cloud than on your own hardware. For example, Amazon’s rates for various instance sizes scale with CPU and memory. In many cases, your application will need some of the more powerful and expensive instances. At some point, it will make more sense to ‘buy’ than to ‘rent’ these instances.

Some of the hype is true.

What can Cloud do for me (an engineer)? The Cloud allows web developers to abstract further away from the hardware we see as less predictable and more frustrating than code. Just as programmers have moved toward writing less code (think 70s and punch cards) and away from worrying about memory allocation (think coding in C), the Cloud allows us to worry less about infrastructure. The Cloud allows us to easily set up as many environments as we need for our code. For example, in a typical web startup, you probably need a “staging” environment where developers can experiment, a “qa” environment where QA can verify release candidates, and a “production” environment for the real deal. Depending on your needs, there can be other variations of code + data required (e.g. a “sandbox” environment for partners), but the Cloud, in conjunction with services like Engine Yard, gives us the ability to easily manage this complexity. The Cloud helps engineers focus on the important stuff.

What can Cloud do for me (an entrepreneur)? The Cloud is a driving force for lowering the barrier to entry for new web-based technology. More importantly, the lower costs allow entrepreneurs to test ideas overnight with practically beer money. In my first startup 5 years ago, we paid over six figures up front to buy our own servers. Today, you could launch for thousands, if not hundreds, of dollars a month. Beyond the initial phase where we can quickly test our ideas, the Cloud also allows us to scale our application and business if the architecture is designed well. With proper planning and frameworks like Ruby on Rails, entrepreneurs can give birth to ideas and help them mature into a business or die a quick death. Yes, “fail early and fail often” is brought to you inexpensively by the Cloud.

The Cloud means progress for engineers and entrepreneurs. It can mean the difference between evaluating 1 or 10 ideas a year by reducing development time and costs. With a potential order-of-magnitude boost in the evolution of ideas, we are moving forward indeed.