An Erlang journey, from start to... finish?


It's been more than a year since I wrote the last Erlang lines that went to production in a Social Point project and I thought that it would be cool to share the experience, so here it is. Also please keep in mind that this is my humble opinion, not from my employer.

First, we started with Erlang because we needed a high-concurrent platform to build a chat and some other real-time features for our games. Like many, did our research and found and studied ejabberd, but we found it too complex for our needs and decided to build it by ourselves from scratch.

So we went for it and invested a lot of time learning the things that surround the amazing world of distributed systems, concurrency and Erlang and OTP (the Open Telecom Platform): the behaviours, the BEAM VM, fault tolerance and the let it crash philosophy, functional programming, recursion, pattern matching, the concurrent programming model, the Actor Model VS the CSP model, the supervision tree, the benefits of message passing and immutability, distributed systems theory topics (high availability, replication, eventual consistency, CRDTs, consistent hashing rings, distributed hash tables, distributed process registers...), clustering BEAM VMs and epmd, benchmarking, OS tunning, compile/release/deploy strategies, dependency managers, remote debugging, learn and use amazing libraries (cowboy, ranch, riak core...), read some books and articles, went to a couple of Erlang User Conferences, did an in-house workshop with Loïc Hoguin...

Screenshot 2016-05-16 16.28.21

Lots and lots of really interesting and useful things there.

And the services we built worked rock-solid, no question.

But we found that to reach a good development speed and time-to-deploy a lot work had to be done, maybe too much. We found ourselves like we had hit a wall somehow. And we felt that it was not easy to introduce Erlang in other backend projects, there was too much friction. We couldn't engage the development team neither the systems team. We were fighting too much. And now I think it's somewhat related to the next tweet I posted the other day from the distributed framework Microsoft has built:

[tweet id="731560538543669250" align="center"]

Are you a distributed systems expert? Maybe, but I know certainly I'm not.

Sometimes it felt that one had to be almost an expert (maybe not a distributed systems one, but an expert nevertheless) to succeed in Erlang development. We wanted our whole development team to participate in this journey, but found that the developer experience wasn't very good, the documentation looked somewhat old, the syntax wasn't familiar, the open source ecosystem and libraries are very small, almost no one knows how to put a serious Erlang project in production, we couldn't find developers in Spain.... It was like we had a very powerful tool but that's very difficult to master. Remember the concepts I wrote in the second paragraph? Certainly they are not something that your learn in a week or a month.

So we tried, succeeded in some aspects, failed in others, and moved on.

As I said, learning Erlang brought a lot of knowledge and experiences about concurrency and distributed systems and many other things. We knew we wanted to move from some parts of our software from a stateless, request/response based, monocore model in PHP to a stateful multicore model and bring those big performance improvements we saw in some projects. We knew we wanted to implement new services with soft real-time capabilities because in game development this paradigm is key in making awesome game experiences.

And this is were Golang matched and now is succeeding in our company, because it was designed to be rapid: rapid to learn (the language syntax is super simple), rapid to develop (amazing standard library), rapid to build and deploy (compile times, no VM, just package and copy), rapid to execute (the performance is stunning) and rapid evolution (vibrant ecosystem, new release every 6 months).

We have been writing Golang code for almost a year and we are very happy about our development speed. We know it's not Erlang, the great platform which has been battle-tested for 30 years in production by some of the biggest telecoms, we don't have OTP and distribution and fault-tolerance for free so we have to do it ourselves if we want those features... but this is fine. Like everything, all decisions have trade-offs and we have decided to change, and in our experience, for the better.

Going back to Erlang, it seems that lately (i mean the last few years, since Erlang is from 1986!) Ericsson has taken more seriously some of these DX issues. They moved the code to GitHub and are more open to user's feedback, releases are faster and bring many goodies (maps, several performance improvements in many areas, new FSM behaviour)... Looks like exciting times are comming for the Erlang community. Maybe it's because the Elixir project, a Ruby-inspired language that compiles to the BEAM VM, is getting some traction lately and I think is trying in some ways to fix some of these Erlang developer experience issues. Or maybe not, anyway we all are lucky that it's happening. I haven't studied it deeply but it looks promising and its documentation looks very good. And the latest ElixirConfEU '16 videos look very interesting.

Finally I hope some day I can hack again in Erlang in my work time, because it's been an amazing experience and I think we are far better software engineers thanks to diving in it.