HighLoad++ is one of the most anticipated events in the world of highly scalable computing. This year, 1,200 developers attended the conference. Ivan Pogudin, Thumbtack’s architect, was one of those representing Thumbtack at the Moscow event. Here are the highlights of his experience.
“Mature Optimization” by Carlos Bueno from Facebook
Although the topics raised by the speaker may have been obvious, they were still useful for prodding your memory and getting you thinking. The keynote asked whether optimization is needed at all. There’s no denying that it is extremely hard to organize an incremental optimization process on a big project. There’s always the trap that optimizing one part affects another. In most cases, optimizing an immature project is a waste of resources: code written only a month ago has a high probability of being changed in the near future, whereas old code is rarely changed. Another risk factor is that the project may simply fail commercially. So is it worth optimizing? And when is the proper time to do it? Carlos’s answer: “It is all about experience”.
“Distributed Systems in Scala with Finagle” by Julio Capote from Twitter
This lecture would have been very interesting for people not familiar with Scala and Finagle. Unfortunately, I am not one of those and thus didn’t learn anything. However, I don’t feel I wasted my time. It’s always a pleasure to listen to a good speaker. And if you’ve never heard of Finagle, I strongly recommend looking into its basic concepts.
“Intergalactic Dataspeak” by David Fetter from Disqus
David talked about new and planned PostgreSQL features for accessing foreign data sources (FOREIGN TABLE and FOREIGN DATA WRAPPER). In essence, this is an evolution of the dblink idea, now realized as an extension of SQL syntax. It also makes it possible to access very different kinds of data sources (http://wiki.postgresql.org/wiki/Foreign_data_wrappers). The guiding concept is “SQL everywhere”. It is undoubtedly a viable idea, although it comes with a lot of limitations. David also presented a catalogue of PostgreSQL extensions: http://pgxn.org/. When asked about the stability of extensions, David replied, “It’s all in your hands,” which definitely sounded like the Open Source way.
I can’t say I was disappointed, but it would have been nice to hear something about how Disqus uses Redis, Cassandra and Storm, rather than just how to access them via SQL.
“How we persist 60k events per second” by Arsen Mukuchyan from Adriver
Arsen told us a story, with a nice, happy ending, about reinventing the wheel. His company implemented a simple* distributed data store with index support and cross-node synchronization. The cluster is hierarchical, so that the number of replicas depends on the age of the data: the older the data, the fewer replicas are maintained.
As a result, by minimizing the number of functional layers in the solution, Adriver significantly reduced its hardware requirements.
*“simple”, that is, in the context of distributed systems; implementing this kind of storage is otherwise far from a simple task.
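To make the age-based replication idea concrete, here is a minimal Python sketch. The tier boundaries, the replica counts and the name `replicas_for_age` are all invented for illustration; the talk summary above does not give Adriver's actual numbers or code.

```python
from datetime import timedelta

# Hypothetical tiers: (maximum age, replicas kept). Not Adriver's real values.
REPLICA_TIERS = [
    (timedelta(days=1), 3),   # fresh data: most replicas
    (timedelta(days=30), 2),  # recent data: fewer replicas
]
MIN_REPLICAS = 1  # anything older than the last tier


def replicas_for_age(age: timedelta) -> int:
    """Return how many replicas to keep for data of the given age."""
    for max_age, replicas in REPLICA_TIERS:
        if age <= max_age:
            return replicas
    return MIN_REPLICAS
```

With these assumed tiers, an hour-old event would be kept on three nodes, while a year-old event would be kept on only one.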
“Review of popular modern disk data storage algorithms: LevelDB, TokuDB, LMDB, Sophia” by Konstantin Osipov from Mail.ru, Tarantool
This was a remarkable talk about database fundamentals. Konstantin walked through several generations of database engine algorithms: B-trees, LSM-trees (LevelDB, Cassandra) and cache-oblivious arrays (TokuDB). He also covered simpler practical implementations such as Bitcask (used in Riak) and Sophia, along with the pros and cons of each.
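For readers who have not met the LSM-tree idea before, the toy Python sketch below may help. It is not taken from the talk or from any of the engines named above: the class `TinyLSM` and its parameters are hypothetical, everything lives in memory, and deletes, write-ahead logging and levelled compaction are left out. It only shows the core pattern: writes go to a small in-memory table, full tables are flushed as sorted immutable runs, and reads check the newest data first.

```python
import bisect


class TinyLSM:
    """Toy LSM-tree: an in-memory memtable plus sorted, immutable runs."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.memtable = {}
        self.runs = []  # oldest run first; each run is a sorted list of (key, value)

    def put(self, key, value):
        # Writes only touch the memtable; a full memtable is flushed as a new run.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # Newest data wins: check the memtable, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one, letting newer values overwrite older ones.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

The trade-off this illustrates is the usual one for LSM-style designs: writes are cheap and sequential, while reads may have to look in several runs until compaction merges them.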
“The MySQL-MariaDB story” by Michael Widenius from Monty Program Ab
Yes, that Monty: the database developer who started out in 1981, founded MySQL and then sold it for a billion dollars. Some time after that deal he became disillusioned with the new model of development and decided to split. That’s the advantage of Open Source: after the sale, the only thing he lost was a name. Michael introduced us to the latest features of MariaDB, which I won’t elaborate on in this article as they can be found in the change logs. He also talked about some changes that haven’t been released yet. The most remarkable of these are new client libraries for the main languages, multi-source replication (one slave with several masters), and storage engines for Cassandra, TokuDB, LevelDB, Connect and many others. It’s worth mentioning that Monty has been working on monetization, and promises not to release commercial versions. Man, does he embrace the spirit of Free Software! We thank him for that!
“Query optimizer in MariaDB 10.0 – now without indexes” by Sergey Golubchik from Monty Program Ab
Sergey’s talk was about a new feature coming in MariaDB 10.0: engine-independent statistics for the query optimizer. Previously, the optimizer could estimate the number of distinct values in a column only if the column was indexed. Now it’s possible to have statistics for plain, non-indexed columns as well. Furthermore, each storage engine used to adjust its statistics so that the optimizer would select the execution plan that was optimal for that engine. This made the outcome unpredictable for cross-engine queries. This will be fixed in MariaDB 10.0. I would also like to thank Sergey for the excellent examples he included in his talk.
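To show why a distinct-value count is useful even without an index, here is a small Python sketch of the textbook estimate an optimizer can derive from it. This is an illustration under a uniform-distribution assumption, not MariaDB's actual code; the names `ColumnStats` and `estimate_equality_rows` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ColumnStats:
    row_count: int        # rows in the table
    distinct_values: int  # distinct values observed in the column


def estimate_equality_rows(stats: ColumnStats) -> float:
    """Estimate rows matching `column = constant`, assuming values are spread evenly."""
    if stats.distinct_values == 0:
        return 0.0
    return stats.row_count / stats.distinct_values
```

For a table of 1,000 rows and a column with 10 distinct values, this estimate gives 100 matching rows, which is the kind of number an optimizer needs when choosing a join order or access method.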
These were only the most interesting and memorable sessions from the conference. There were many others. So many, in fact, that it would be impossible to talk about them all in a single article. But if you’re interested in hearing more, you can order a video of all the sessions.
Overall, conferences are about the atmosphere. You could go off and learn about the different topics on your own without too much effort. But you don’t get the same emotional energy from a screen as you get from a room full of enthusiastic, engaged individuals. And for that, HighLoad++ 2013 was an excellent experience.
For more information about the conference, visit: http://www.highload.ru.