Jump on the real-time web with RethinkDB

Episode #65, published Wed, Jun 29, 2016, recorded Wed, Jun 22, 2016


Long gone are the days of the web acting as just linked documents and glorified brochures. Web apps of today are just that, rich interactive applications. But unlike desktop apps of old, these are apps with hundreds of thousands or even millions of concurrent users.

We expect that these apps will instantly reflect changes to the data, potentially made by any of the users connected to the system while we are using them.

This has put a strain on the web servers, databases, and architecture of our web apps. Technology has responded by delivering amazing real-time capabilities with things like WebSockets and SignalR at the client layer and event-driven systems on the web servers. But what about the database? Could it be events all the way down?

That was the goal of RethinkDB's cofounders when they pitched it to Y Combinator.

Links from the show:

RethinkDB: rethinkdb.com
RethinkDB on github: github.com/rethinkdb/rethinkdb
Rethink on Twitter: @rethinkdb
Slava on Twitter: @spakhm
horizonjs on Twitter: @horizonjs
Quickstart: rethinkdb.com/docs/quickstart
Horizon.js: horizon.io
Horizon cloud: horizon.io/cloud

Episode Deep Dive

Guest Introduction and Background

Our guest for this episode is Slava Akhmechet, co-founder of RethinkDB. Slava grew up in Ukraine and got into programming at a young age, tinkering on a ZX Spectrum that connected to a television and used cassette tapes for storage. He later pursued graduate studies focused on distributed systems and high-performance computing, which eventually led him to co-found RethinkDB. Slava and his team went through the Y Combinator startup accelerator in 2009, and they invested years into building a database solution that emphasizes real-time updates and ease of use for developers.

What to Know If You're New to Python

Basic Understanding of Python Syntax: Comfort with Python’s basic data types, loops, and functions will help you follow the driver examples (e.g., import rethinkdb).
Package Installation: Familiarity with installing packages, such as using pip install rethinkdb, is helpful.
Running a Local Server: You’ll need to run or install RethinkDB locally (or in a container) before you can query it from Python.
Event-Driven Concepts: Even if you're just getting started, understanding Python’s approach to concurrency or event loops (such as asyncio) can help you appreciate real-time data updates.
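
To make the event-driven idea concrete, here is a minimal sketch that uses only Python's standard asyncio module. It does not talk to RethinkDB: the queue stands in for a stream of change notifications, and the names producer and consumer, along with the sample documents, are made up for illustration.

```python
import asyncio


async def producer(queue: asyncio.Queue) -> None:
    """Pretend to be a data source that emits change events over time."""
    for doc in ({"id": 1, "title": "Dune"}, {"id": 2, "title": "Emma"}):
        await asyncio.sleep(0.01)  # simulate waiting for the next change
        await queue.put(doc)
    await queue.put(None)  # sentinel: no more events


async def consumer(queue: asyncio.Queue) -> list:
    """React to each event as it arrives instead of asking repeatedly."""
    seen = []
    while True:
        event = await queue.get()
        if event is None:
            break
        seen.append(event["title"])
    return seen


async def main() -> list:
    queue = asyncio.Queue()
    # Run producer and consumer concurrently on one event loop.
    _, titles = await asyncio.gather(producer(queue), consumer(queue))
    return titles


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The consumer never asks "is there anything new?"; it simply waits until the next event is handed to it, which is the same shape of code a push-based data source encourages.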

Key Points and Takeaways

Why RethinkDB Focuses on Real-Time
- RethinkDB was designed to push data changes to the application layer without repeated polling. Slava explained how traditional databases rely on request-response, but RethinkDB implements “push” queries (changefeeds) where the database notifies the client about changes in the data.
- This model is especially valuable for collaborative apps, live dashboards, or chat systems that need continuous updates.
- Links and Tools:
  - RethinkDB main site
  - RethinkDB GitHub repo
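
As a rough sketch of what subscribing to a changefeed can look like from Python, the snippet below uses the rethinkdb driver's connect, table, changes, and run calls. Several things here are assumptions rather than anything stated in the episode: the table name "messages" is hypothetical, a server is assumed to be listening on localhost:28015, and the load_reql helper guesses at the driver version (releases from 2.4 onward expose a RethinkDB class, while the 2016-era driver exposed the query API on the module itself).

```python
def load_reql():
    """Return the ReQL entry point, importing the driver lazily."""
    import rethinkdb  # installed with: pip install rethinkdb

    # Assumption: driver 2.4+ exposes a RethinkDB class, while the releases
    # current in 2016 expose the query API directly on the module.
    if hasattr(rethinkdb, "RethinkDB"):
        return rethinkdb.RethinkDB()
    return rethinkdb


def describe(change):
    """Turn one changefeed notification into a short label."""
    old, new = change.get("old_val"), change.get("new_val")
    if old is None:
        return "inserted"
    if new is None:
        return "deleted"
    return "updated"


def watch(table_name="messages", host="localhost", port=28015):
    """Print every change to a (hypothetical) table until interrupted."""
    r = load_reql()
    conn = r.connect(host=host, port=port)
    # changes() returns a cursor that blocks until the server pushes a change.
    for change in r.table(table_name).changes().run(conn):
        print(describe(change), change)
```

Calling watch() requires a running server; describe() can be tried on its own with a dictionary shaped like a notification, for example {"old_val": None, "new_val": {"id": 1}}.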
Origins and Y Combinator Story
- Slava and his co-founders pitched the concept of a real-time, event-driven database to Y Combinator in 2009. At that time, they saw a gap: many web technologies and frameworks had real-time capabilities, yet databases remained request-based.
- The team’s vision was to make real-time “the default” for web apps, which strongly influenced RethinkDB's design.
- Links and Tools:
  - Y Combinator
Core Features: Document-Oriented Yet Supports Joins
- Unlike some NoSQL databases, RethinkDB supports distributed subqueries and joins, making it more flexible for complex queries.
- It stores data as JSON documents, aligning well with modern app development while also offering powerful features often associated with relational databases.
- Links and Tools:
  - Official RethinkDB documentation
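
To illustrate what a join over JSON documents produces, here is a small pure-Python sketch. It is not ReQL and does not contact a database: plain lists stand in for tables, the eq_join function is a local helper named after the ReQL command of the same name, and the authors and books data are invented for the example.

```python
def eq_join(left, field, right, key="id"):
    """Pair each left document with the right document whose key matches."""
    index = {doc[key]: doc for doc in right}  # lookup table on the right side
    return [
        {"left": doc, "right": index[doc[field]]}
        for doc in left
        if doc.get(field) in index
    ]


authors = [{"id": 1, "name": "Ursula K. Le Guin"}, {"id": 2, "name": "Frank Herbert"}]
books = [
    {"id": 10, "title": "Dune", "author_id": 2},
    {"id": 11, "title": "The Dispossessed", "author_id": 1},
    {"id": 12, "title": "Anonymous Pamphlet", "author_id": 99},  # no matching author
]

pairs = eq_join(books, "author_id", authors)
# Flatten each pair into one document that carries the author's name.
merged = [{**p["left"], "author": p["right"]["name"]} for p in pairs]
```

Documents without a matching partner, such as the pamphlet above, are dropped, which is the usual behavior of an inner join.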
Open Source and Building a Business Around It
- All of RethinkDB’s core code is open source and available on GitHub. Slava described the philosophy that everyone in the community, including the developers at RethinkDB, is on equal footing.
- Their business model primarily centers on support and services for large-scale deployments. Operational complexities, like sharding and encryption, drive enterprise customers to purchase professional support.
- Links and Tools:
  - RethinkDB on GitHub
Developer Experience and Ease of Use
- RethinkDB ships with a streamlined administration console that offers a visual interface for cluster management, data exploration, and real-time logs.
- The team put significant effort into designing a query language (ReQL) that feels natural in whichever language (e.g., Python, Node.js, Ruby) you use.
- Links and Tools:
  - RethinkDB ReQL documentation
Horizon.js: Simplifying Real-Time App Development
- The conversation covered Horizon.js, a “prefabricated backend” that allows front-end developers to connect directly to RethinkDB without writing their own server.
- Horizon handles authentication, data synchronization, and real-time updates by default, making it popular among developers who want to avoid heavy back-end coding.
- Links and Tools:
  - Horizon GitHub (archived)
Horizon Cloud for Deployment
- For large-scale use cases, Horizon Cloud is an offering to help you deploy your Horizon-based real-time applications on cloud infrastructure. While still in a private beta at the time of the recording, it aimed to automate the complexities of scaling and uptime.
- This was an extension of the operational expertise the RethinkDB team developed while supporting enterprise customers.
- Links and Tools:
  - Compose.io (Now part of IBM Cloud)
Comparing MongoDB and Traditional Relational Databases
- While both RethinkDB and MongoDB are document-oriented, RethinkDB’s real-time “push” architecture sets it apart. Slava also explained how RethinkDB can handle joins more natively than MongoDB.
- In many enterprise settings, RethinkDB can coexist alongside relational databases, each covering different workloads.
- Links and Tools:
  - MongoDB main site
Security and Enterprise Features (RethinkDB 2.3)
- RethinkDB 2.3 introduced user accounts, encryption in transit, and official Windows support. These features, while less flashy, are critical for enterprise adoption.
- The conversation highlighted how large organizations typically require a strict set of compliance and security capabilities before they can adopt new data technologies.
- Links and Tools:
  - RethinkDB Release Notes
Community-Driven Development
- Slava emphasized that design discussions and new feature proposals happen openly on GitHub, allowing community members to influence the direction of the database.
- The RethinkDB community is known for being welcoming, with official and user-led meetups, Slack channels, and thorough documentation.
- Links and Tools:
  - RethinkDB community info

Interesting Quotes and Stories

"We never thought about how long it will take. We mostly looked at, hey, what is this going to make possible?" -- Slava, on building RethinkDB from scratch

"One of the biggest design goals we had was to make [RethinkDB] very, very easy to use. Every extra step a developer has to go through cuts our user base in half." -- Slava, emphasizing the importance of user experience

Key Definitions and Terms

Real-Time Database: A database that pushes updates to the client as soon as the data changes, rather than waiting for a client to re-request.
Changefeeds: The mechanism in RethinkDB that allows the client to subscribe to queries and be notified of data changes.
Joins: A way to combine data from multiple documents or tables in a single query, commonly known in relational databases.
Horizon.js: A server and framework on top of RethinkDB that lets front-end developers build real-time apps without writing a custom backend.
Document Database: A NoSQL database model that stores data in JSON-like documents, rather than rows and columns.

Learning Resources

Python for Absolute Beginners (Talk Python Training) training.talkpython.fm/courses/explore_beginners/python-for-absolute-beginners Great if you’re just starting with Python.
MongoDB with Async Python (Talk Python Training) training.talkpython.fm/courses/mongodb-with-async-python-beanie-and-pydantic While focused on MongoDB, this course covers general NoSQL concepts, async data flows, and can inform your RethinkDB usage.
RethinkDB Official Docs rethinkdb.com/docs The best place to get started hands-on with RethinkDB itself.

Overall Takeaway

RethinkDB’s story offers an inspiring look into how a small team tackled the challenge of bringing real-time, event-driven workflows down to the database layer. Slava and his co-founders believed that modern apps shouldn’t stop at real-time UIs; they wanted real-time at every level, including data persistence. By focusing intensely on developer experience, building an open-source community, and offering a powerful yet approachable toolchain, RethinkDB has carved out a notable niche. Whether you’re building collaborative tools, dashboards, or chat apps, RethinkDB’s push-based architecture and easy-to-use features can simplify your stack and supercharge your real-time applications.

Episode Transcript


00:00 Long gone are the days of the web acting just like linked documents and glorified brochures.

00:04 Web apps of today are just that, rich interactive applications.

00:08 But unlike your average desktop app, these are apps with hundreds of thousands or even millions of concurrent users.

00:14 We expect these apps will instantly reflect changes to the data potentially made by any of these many users connected to the system while using them.

00:21 This has put a strain on the web servers, databases, and architecture in general of our web apps.

00:27 Technology has responded by delivering amazing real-time capabilities with things like WebSockets and SignalR at the client layer and event-driven systems on the web servers.

00:36 But what about the database? Could it be events all the way down?

00:40 That was the goal of RethinkDB's co-founders when they pitched it to Y Combinator.

00:44 Now it's time to hear the story of RethinkDB with Slava Akhmechet.

00:48 This is Talk Python To Me, episode 65, recorded June 22, 2016.

00:55 Welcome to Talk Python To Me, a weekly podcast on Python.

01:21 The language, the libraries, the ecosystem, and the personalities.

01:24 This is your host, Michael Kennedy.

01:27 Follow me on Twitter where I'm @mkennedy.

01:29 Keep up with the show and listen to past episodes at talkpython.fm.

01:32 And follow the show on Twitter via @talkpython.

01:35 This episode is brought to you by Hired and SnapCI.

01:39 Thank them for supporting the show on Twitter via @Hired_HQ and @Snap_CI.

01:45 Slava, welcome to the show.

01:48 Thank you. It's good to be here.

01:49 Yeah, I'm really excited to talk about my favorite kind of databases, which are document databases and NoSQL databases.

01:58 So we're going to dig into RethinkDB today.

02:01 And you guys have such a cool story and such a cool community that you built up.

02:05 And we'll get into that.

02:07 But before we do, let's start at the beginning.

02:09 What's your story?

02:09 How did you get into programming?

02:10 Well, so I was born in Ukraine.

02:12 And when I was maybe seven or eight years old, my parents got me a machine called ZX Spectrum, which was this tiny computer that plugged into a TV and a cassette player.

02:22 And they got it for me to play games.

02:23 But when you booted it, it had a BASIC interpreter.

02:26 And somehow, I don't know what happened, but the BASIC interpreter was more interesting to me than the video games.

02:32 And I just kind of started learning BASIC and learning, you know, learning how to type commands and what they do.

02:39 And eventually I wanted to make games.

02:42 So, you know, I used the BASIC interpreter's graphing library.

02:46 And it was really slow.

02:47 So that's how I got into assembly because I realized I could speed things up.

02:51 And, yeah, and that was the beginning.

02:54 That's a bit of a jump, right?

02:56 It was.

02:56 It was a bit of a jump.

02:58 And at the time, like, there was no internet.

03:00 So I had to work off of a manual that was kind of printed on a criminal printer and half the stuff was missing.

03:06 But that's how I got into it.

03:08 And I just never stopped.

03:09 Yeah, that's awesome.

03:10 I've also made that jump from high-level languages to assembly for a little bit of time.

03:16 And, yeah, I kind of left that behind.

03:19 But those were basically the options in the early days, right?

03:22 BASIC and assembly.

03:24 You could pick your extreme and go to it, I guess.

03:27 Yeah, on the ZX Spectrum, there were really two options.

03:30 It was the built-in BASIC interpreter and assembly.

03:32 There were no compilers.

03:34 Like, you couldn't get.

03:35 I don't think you could get a different language on it.

03:38 So, yeah, it was just those two things.

03:41 And I figured out how to mix and match them and, you know, do all kinds of cool stuff on that machine.

03:46 Yeah, I probably learned, like, 75% of everything I know now about programming, just, like, tinkering.

03:54 It was the ZX Spectrum.

03:56 And it was the kind of a computer that's similar.

03:58 I guess it's similar to the Atari.

03:59 But we couldn't get that in Ukraine.

04:01 So, you know, I was stuck with that, like, four kilobyte of memory, you know.

04:06 I couldn't save anything.

04:07 Couldn't save anything.

04:09 So that's how I got started.

04:10 Wow, that's cool.

04:11 You said it connected to a television.

04:13 What do you think the resolution in pixels of that thing was?

04:16 Oh, man, so I don't remember, but I think it was, like, 120 pixels vertical or something like that.

04:24 It was really small.

04:25 Probably.

04:26 Or, you know, it didn't feel that way at the time.

04:28 No, it felt amazing at the time, right?

04:31 It felt amazing.

04:31 Yeah, it was amazing.

04:32 I'm sure it wouldn't be fun programming it now.

04:34 Yeah, it's probably a thing better left to nostalgia.

04:38 Yeah, I was actually thinking about buying it on eBay.

04:41 Oh, nice.

04:42 Because they're available and they're pretty cheap, but I never actually got around to it,

04:45 and I don't know if I ever will.

04:46 Yeah, nostalgia.

04:49 How interesting.

04:50 So let's talk about something a little more modern.

04:53 You built a pretty amazing database, a document database called RethinkDB,

04:58 and your tagline is it's a database for the real-time web.

05:03 What is RethinkDB?

05:05 Well, so RethinkDB is a NoSQL scalable real-time database.

05:09 When you start out using it, it's fairly similar to MongoDB in the sense that you could store

05:14 and retrieve JSON documents.

05:15 You could scale it out to multiple machines or multiple data centers.

05:19 But that's pretty much where the similarities end.

05:21 What's unique about RethinkDB that you can't get in any other database that I know of right now

05:28 is that it's designed for real-time applications.

05:32 And the way that it works is in a traditional database, you send a query and get a response,

05:36 and then you're done.

05:37 And if you want an update, you have to send another query.

05:39 And in RethinkDB, you can subscribe to queries.

05:43 So you could say, I'm interested in the top 10 selling books in my bookstore.

05:47 And then any time the data in the database changes in a way that updates that result,

05:52 the database pushes the notification to the application.
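
The bookstore subscription described here can be pictured with a small in-memory model. This is only a sketch of the idea, not how RethinkDB is implemented: the Bookstore class, its method names, and the sample titles are all hypothetical, and the list is limited to the top two titles to keep the output short.

```python
class Bookstore:
    """In-memory stand-in for a table that notifies subscribers of changes."""

    def __init__(self, top_n=10):
        self.sales = {}          # title -> copies sold
        self.top_n = top_n
        self.subscribers = []    # callbacks to invoke when the top list changes

    def top(self):
        # Sort by copies sold (descending), breaking ties by title.
        ranked = sorted(self.sales.items(), key=lambda kv: (-kv[1], kv[0]))
        return [title for title, _ in ranked[: self.top_n]]

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def record_sale(self, title, copies=1):
        before = self.top()
        self.sales[title] = self.sales.get(title, 0) + copies
        after = self.top()
        if after != before:  # push only when the subscribed result changed
            for callback in self.subscribers:
                callback(after)


store = Bookstore(top_n=2)
notifications = []
store.subscribe(notifications.append)

store.record_sale("Dune", 3)       # top list becomes ["Dune"]
store.record_sale("Emma", 1)       # top list becomes ["Dune", "Emma"]
store.record_sale("Dune", 1)       # ranking unchanged, so nothing is pushed
store.record_sale("Ulysses", 5)    # top list becomes ["Ulysses", "Dune"]
```

The application registers its interest once and then only hears about writes that actually alter the result it cares about; the third sale changes the data but not the ranking, so no notification is sent.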

05:55 So if you're building any kind of a collaborative application or a multiplayer game where things

06:00 change all the time or some analytic software where you need updates right away,

06:05 RethinkDB is a really, really great database for that.

06:08 And it's unique, I think, because no one else does that.

06:10 Where your application is designed around this event-driven model where anytime something

06:17 changes in the database, you get an event saying, hey, the relevant results that you're interested

06:22 in are now different.

06:23 And that makes it dramatically, dramatically easier for people to build real-time apps.

06:28 Yeah, that's really cool.

06:29 And that is quite a unique capability.

06:32 I'm quite sure that MongoDB does not have that.

06:35 You've got to ask the question over and over to get new answers.

06:39 In modern databases that aren't RethinkDB, you can kind of fake this by doing various things.

06:45 Like you could poll and you can ask the same question over and over again.

06:49 You could subscribe to the replication log.

06:52 There are things users can do to get a semblance of this functionality.
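
For contrast, the polling workaround mentioned here looks roughly like the sketch below: the application re-runs the same query and works out for itself what changed between two results. Everything in it is illustrative. The fetch callable stands in for "run the query again", and the scripted snapshots replace a real database.

```python
def diff_snapshots(previous, current):
    """Compare two {id: document} snapshots and list what changed."""
    changes = []
    for key in current.keys() - previous.keys():
        changes.append(("inserted", key))
    for key in previous.keys() - current.keys():
        changes.append(("deleted", key))
    for key in current.keys() & previous.keys():
        if current[key] != previous[key]:
            changes.append(("updated", key))
    return sorted(changes)


def poll(fetch, rounds):
    """Call fetch() repeatedly and collect the differences between rounds."""
    seen = fetch()
    events = []
    for _ in range(rounds):
        # A real poller would sleep here, trading latency against load.
        latest = fetch()
        events.extend(diff_snapshots(seen, latest))
        seen = latest
    return events


# Scripted "query results" standing in for a database that changes over time.
snapshots = iter([
    {1: {"title": "Dune"}},
    {1: {"title": "Dune"}},                            # nothing changed
    {1: {"title": "Dune (2nd ed.)"}, 2: {"title": "Emma"}},
])
events = poll(lambda: dict(next(snapshots)), rounds=2)
```

Every round costs a full query and a full comparison even when nothing changed, and changes are only noticed on the next poll, which is the overhead a built-in push mechanism is meant to remove.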

06:56 But it's really different when it's baked into the product or project on day one.

07:02 Because it just opens up a whole world of possibilities.

07:05 And when it's a higher level feature that you can run on most queries or almost all queries,

07:10 it opens up so many possibilities that you can't really do by faking it in other systems.

07:16 Yeah, I totally agree.

07:17 That sounds great.

07:18 I recall you telling a story.

07:21 I heard it somewhere.

07:23 You sort of talked about the progression of real-time systems, right?

07:29 So we have the front end being real-time with things like Node.js and SignalR and some of these things that happen on the server.

07:37 You can get real-time stuff there.

07:39 But then it kind of stops.

07:41 Can you maybe tell that story?

07:42 I think that'll help really cement what your value proposition is.

07:47 Yeah, so traditionally web applications were built around request response.

07:51 Like that's how HTTP works, right?

07:54 You type in a web address in the browser.

07:56 The browser sends a request to the web server.

07:58 The web server sends a request to the database to get some information.

08:02 The database responds.

08:03 The web server responds.

08:04 And then you render the page.

08:05 So that's how web apps were built traditionally.

08:08 And then things started changing because people realized that to have really immersive experiences,

08:15 they have to push information to the browser without reloading the whole page, refreshing the whole page.

08:20 So what started happening is that people started building applications around push functionality.

08:25 So JavaScript is fundamentally event-driven.

08:28 And people started building front-end frameworks like Angular and React around the idea of events and event-driven programming.

08:35 And then as that became available, you needed that on the back end.

08:40 And we have things like SignalR or WebSockets to allow push to the browser.

08:44 So on the back end, Node.js is this fundamentally event-driven programming model where you can respond to events.

08:51 But then when you get to the database, it's still request-response.

08:54 And the idea behind RethinkDB was, well, let's make a full stack event-driven programming, environment programming model,

09:02 where you don't stop at Node.js, but the database itself is event-driven,

09:06 where you say, I'm interested in these things.

09:09 And then anytime something changes, we fire an event.

09:12 And then Node.js processes that event and pushes that to the browser.

09:15 And then the browser processes that event.

09:16 So what that does is it gives you a complete full stack event-driven programming model.

09:21 And now you don't have to fake events when you go from the Node.js layer to the database layer.

09:27 The whole thing is event-driven.

09:28 And that makes just real-time architecture so much easier to deal with and opens up so many possibilities for building apps

09:35 because things that took a week now take a couple of hours.

09:39 Yeah, that's really cool because those are hard problems to solve.

09:42 And if you can just plug the pieces together, that's wonderful, right?

09:46 Yeah.

09:47 So for all of these problems, you can kind of hack around them.

09:49 And people have done that for a long time.

09:51 I mean, people built real-time event-driven applications before RethinkDB existed.

09:55 But hacking around this problem, it takes a long time.

09:59 People solve the same problem over and over and over again.

10:01 It's really hard.

10:03 You kind of need domain expertise.

10:05 So you need a team of people that really understand this problem and got burned a couple of times by trying to solve it.

10:10 And the idea was like, hey, let's just abstract this so people never have to deal with this again.

10:14 Yeah, that's great.

10:15 Can you give some examples of some apps that people have built or the types of apps people have built?

10:20 Yeah, so RethinkDB has a pretty huge community.

10:22 So we have about 200,000 developers building applications on it now.

10:26 And it's doubling about every three to four months.

10:29 So people have built all kinds of apps.

10:32 People have built games.

10:34 People have built analytics apps.

10:36 Just anything you can imagine that can be built from mobile or the web.

10:39 One of our big examples is Fidelity Investments, which is a big investment firm, built a mobile app for their users to manage their accounts in real time.

10:49 So you open the Fidelity mobile app.

10:52 All of that is backed by RethinkDB.

10:54 I think that's probably the biggest use case I can talk about where RethinkDB backs basically like 40 or 50 million users that Fidelity has.

11:03 Some others like Rethink is used by NASA for some real-time updates on what happens with extravehicular suit activity.

11:11 So every time astronauts go on a spacewalk, there's a bunch of data that's being generated, being processed in real time.

11:17 So it's all over the place.

11:19 People use it for financial applications, for just web apps, for collaborative apps, and for things I personally wouldn't imagine like spacesuit activity.

11:29 Yeah, that's really awesome.

11:30 I would have never thought spacewalks would be one of the use cases, but yeah, that's really cool.

11:36 Yeah, we definitely did not design it with that in mind.

11:40 I would never think about that in a million years.

11:43 Let's see.

11:44 If we set up a VPN from the space shuttle, then...

11:47 No, just kidding.

11:48 Yeah, no, it's an SSH connection, probably.

11:52 Yeah, probably.

11:53 Okay, so let's talk about the origins of RethinkDB.

11:58 What gave you the idea that, hey, we should create our own database?

12:03 Like, that's pretty daunting, right?

12:04 Before we started, I was in grad school, and I was actually working on mammalian brain simulation on supercomputers.

12:12 So I knew a lot about just distributed computing, and I knew a lot about the idea of, like, real time.

12:19 Because in brain simulation, you generate a lot of data that you have to send between multiple machines,

12:25 and there are a lot of challenges in figuring out how to parallelize all that stuff, because the network is the bottleneck.

12:30 And my co-founder, who was also at the same university, he was more of a human-computer interaction expert.

12:36 Like, he basically was really interested in user interfaces.

12:39 And when we met, we just kind of got together, and we spent, like, hours and hours talking about computing

12:44 and the future of software and where things are going.

12:46 And it was obvious to us that the world is really becoming more real-time.

12:51 Like, there's a ton of data being generated.

12:53 We generate more and more and more data every day, probably exponentially.

12:56 And it was very clear that, like, these static user interfaces just aren't going to last very long,

13:03 and everything's becoming real-time.

13:05 So we looked at how applications are being built, how they're being deployed.

13:08 And it occurred to us that, like, there's a lot of innovation around real-time almost everywhere in the stack except the database.

13:16 And it was just the part we thought was very important to build to advance this mission of kind of unlocking real-time apps for anybody who wanted to do it.

13:25 Like, what we want to do, and it's our goal even now, is to make real-time be the default.

13:31 So when you build an app, it should be real-time by default, not static by default, and then people have to do a lot of work.

13:37 So that's how we got started.

13:38 And, yeah, building a database is a daunting process.

13:40 It's a very hard problem.

13:41 It's a lot of work.

13:42 But we were excited about, you know, we kind of, we didn't look at the downsides.

13:47 We mostly looked at, hey, what is this going to make possible?

13:50 And it was just really exciting.

13:52 So we never thought about how long it will take.

13:54 Yeah, sure.

13:55 And people are successful, right?

13:56 They have this dream and this vision, and it's like, we're going to make it through the challenging bits, and we're going to create this thing, right?

14:03 So it sounds like you were really driven to sort of solve that real-time problem.

14:07 We were.

14:08 I mean, to be honest, we also thought it would be easier than it was.

14:11 And we thought it would take a lot less time than it did.

14:15 So a lot of it was naivete, but it served us pretty well, I think, in hindsight.

14:19 Yeah, maybe it did let you get through.

14:23 Otherwise, it would have been like, well, that's too much work.

14:25 Forget it.

14:26 That's cool.

14:26 And one of the things you did pretty early on was you went through Y Combinator, right?

14:30 The accelerator.

14:31 Yeah, we did.

14:32 Cool.

14:32 So what was that experience like?

14:34 What year was that?

14:35 So we went through Y Combinator in the summer of 2009.

14:38 And at the time, me and Michael, my co-founder, we were in New York, and we were in school.

14:43 And we had this idea, and we applied to Y Combinator.

14:47 We got an interview.

14:48 So we flew to California.

14:49 The interview was like eight minutes.

14:52 We got in, and we knew we wanted to go to California and start a company.

14:57 So we just packed up our bags and moved here.

14:59 And Y Combinator for us was great because we were new to the startup world.

15:04 We kind of knew a lot about software, but we didn't know a lot about how to start businesses

15:08 and hire people and manage people and how any of that stuff works.

15:12 And Y Combinator would invite speakers who were successful in the startup world.

15:16 And every Tuesday, they'd have a new speaker who would talk about their story.

15:19 And just that process of listening to people who've built successful things before was extremely

15:25 useful because it got us in the mindset of what it takes to build something that people

15:32 really want to use, like build something that makes, that's valuable for a lot of people.

15:36 So that was great.

15:38 I mean, it was a phenomenal experience.

15:40 I'd recommend it to anybody.

15:41 Yeah.

15:42 I've never gone through an experience like that, but it sounds really cool and like it would

15:46 be very beneficial.

15:47 So, you know, I think when people are new and they're thinking about starting businesses,

15:53 they often think about the technology, right?

15:56 Like if we build this app and have the best technology, it'll just work, just the way that

16:03 it should, solving whatever problem it's solving. But that's only like 30%, 20% of what it takes to

16:10 start launch and be successful with a technology business, right?

16:14 There's all the marketing and the growth hacking and the user outreach.

16:18 And like how much of those types of things did you learn at YC?

16:22 We learned a lot about that at YC.

16:24 It was, I'd say, so everyone who goes to YC is very competent technically.

16:30 And that was the whole premise of YC, that you get really technically competent people and

16:35 you teach them all these other things that they need to know to build something successful.

16:39 So technology itself, I mean, it permeated everything.

16:42 It was in the background of everything because obviously it's about tech companies, but it wasn't,

16:47 we didn't learn like how to program at YC or anything.

16:50 It was all about how do you build a successful project?

16:53 How do you grow it?

16:54 How do you build a product that people really care about?

16:57 How do you think about markets?

16:59 Things like that.

17:00 It was very little of it was about the technology itself.

17:03 Like the whole thing was under the premise that technology is super important and technology

17:09 unlocks all kinds of possibilities for people.

17:12 But everyone at YC was already good at that.

17:14 So the bigger part, the much bigger part was around all these other things you mentioned.

17:18 Right.

17:18 Yeah.

17:19 So the technology is basically table stakes.

17:21 Like you don't even get there if you don't have that skill set.

17:25 But it's teaching you all the things you didn't maybe even know that you needed to know to

17:29 succeed.

17:30 Right.

17:30 Yeah.

17:30 I think the premise of YC is if you take technically competent people, all this other

17:35 stuff, you can teach them.

17:36 Marketing and growth hacking and all these things.

17:39 Like it's hard, but it's not rocket science.

17:41 Like if you can build a compiler, you could probably figure out how to market your product.

17:46 That's the premise of YC.

17:47 And the idea was like, hey, there's a lot of information there.

17:51 It's not very hard, but it's hard to get.

17:52 So we're just going to teach people all this stuff and see what happens.

17:55 And I think the proof is in the pudding.

17:57 Like YC is really successful now.

17:59 This episode is brought to you by SnapCI, the only hosted cloud-based continuous integration

18:21 and delivery solution that offers multi-stage pipelines as a built-in feature.

18:26 SnapCI is built to follow best practices like automated builds, testing before integration,

18:31 and provides high visibility into who's doing what.

18:34 Just connect Snap to your GitHub repo and it automatically builds the first pipeline for you.

18:39 It's simple enough for those who are new to continuous integration, yet powerful enough

18:43 to run dozens of parallel pipelines.

18:45 More reliable and frequent releases.

18:47 That's Snap.

18:48 One of the cool aspects of RethinkDB is it's open source, right?

19:08 Like I can go to GitHub and get it.

19:10 I go to github.com/rethinkdb/rethinkdb to get the database itself.

19:15 And it's really popular, right?

19:17 It has over 14,000 stars.

19:19 Yeah.

19:20 So you said that it was one of the biggest C++ open source projects on GitHub or something like that.

19:26 Is this true?

19:27 Yes.

19:27 It depends on how you measure.

19:28 So if you go to github.com/explore, I think they keep changing the interface.

19:33 I don't know if you can check this now.

19:35 But at least they used to be.

19:37 So you used to be able to look at trending projects.

19:39 I think that's still available.

19:41 And you used to be able to look at, like filtered out by stars, by language, by all kinds of things.

19:46 I haven't done this in a while.

19:47 But yeah, Rethink has been the biggest trending C++ project on GitHub for a long, long time.

19:53 Pretty much since we launched it, actually.

19:55 Yeah.

19:56 That's really cool.

19:57 When did you actually launch it?

19:58 You said you did Y Combinator in 2009.

20:00 When was there a thing that you let loose on the world?

20:03 I think it took about three and a half years to get the first version out.

20:08 My memory is a little hazy on this because it's been a while.

20:10 But yeah, it's a hard problem.

20:12 And it took a while to build the first version of the product.

20:15 How many people worked on it?

20:16 So right now we're 18 people.

20:18 But before the first version, I think the whole company was about seven.

20:23 Okay.

20:24 It's definitely a big project to undertake.

20:26 So yeah, very cool.

20:28 When I look at GitHub, it tells me that it's like 50% C++, 26% Python, some JavaScript.

20:36 What are the technologies inside?

20:39 Yeah.

20:40 So Rethink is pretty complex because it touches almost every part of the stack.

20:44 And on the bottom of the stack, it touches the operating system and all the APIs that we have to deal with

20:51 to do disk access and network access and memory management and things like that.

20:54 So all of that stuff is done in C++ and even a little bit of assembly, actually.

20:58 And then around the core C++ database, there are a lot of technologies to connect it to users.

21:06 So the drivers are done in a lot of different languages.

21:09 Like there's a Node.js driver, a Python driver, a Java driver, Ruby driver, just drivers for all kinds of languages.

21:15 So that is written in whatever native language the driver is for.

21:20 Then there is a lot of code for testing different things, for testing the query language, testing the distributed system, all kinds of tasks.

21:28 So most of that is written in Python.

21:30 And then there's a lot of glue code.

21:32 You know, there are Bash scripts.

21:33 So we try to minimize the number of languages and technologies that we use, just to keep it contained.

21:40 So most of it is C++, but we have a lot of different things that we use.

21:46 I'd say the biggest ones are C++, Python, and Bash.

21:49 And then there are drivers for almost every language.

21:52 Right.

21:52 And you pretty much have to use those languages to write them usually anyway.

21:56 Yeah.

21:56 Okay.

21:57 Cool.

21:58 So when I think about NoSQL databases, you've got the document databases, you've got key value stores and those sorts of things.

22:07 It feels to me like document databases are the kind of database that you could use as your sole database for your app.

22:16 Right.

22:17 There's some things like Redis, DynamoDB, where you probably use it for like some particular use case, but it's not your only data store.

22:24 What do you think about this idea that document databases could replace your relational database rather than just be another thing that you use in parallel?

22:32 Well, so databases are fundamentally very horizontal technologies and you could use them.

22:37 They're applicable to a lot of different problems.

22:41 And I think that so relational databases used to be this Swiss army knife where you could use them for almost anything because most of the data used to be relational.

22:50 And now with modern apps, most of the data is not relational.

22:55 Most of the data is hierarchical.

22:57 There are a lot of fields missing.

22:58 You know, it's what we think of as document-based data.

23:01 So I think for most of the modern use cases, object oriented or document databases are really, really versatile and they could be used for almost anything.

23:11 Now there's still data being generated that's relational.

23:14 And just like with anything, like you could use relational databases to store document data.

23:19 You could use document databases to store relational data and it will work fine, but you'll have to hack around a bunch of problems.

23:26 So I think document databases will be used more and more and more as more data is being generated, more apps are being built, and those are fundamentally not relational data models.

23:35 So document databases are great for that.

23:38 I don't think they're going to replace relational databases personally because relational databases are just fundamentally better at storing relational data.

23:46 And, you know, some people decide to unify their infrastructure.

23:49 They figure, I don't want two pieces, two technologies managing things.

23:53 I want just one.

23:54 And they could put relational data in the document database and it will work just fine.

23:58 But it's not, you know, it's not ideal.

24:01 I think that you'd have to hack around a bunch of things.

24:04 There are some things you can't do.

24:05 So my view of the world is that people will continue using different specialized technologies for different use cases.

24:12 And there's always a balance.

24:14 Like if you use too many specialized technologies, it's too hard to manage.

24:17 If you use too few, you have to hack around a bunch of stuff and you have operational problems.

24:21 So there is a good middle ground there.

24:23 And that sort of feels to me like that's where the world is going.

24:26 That's what we see most of our customers use.

24:28 Okay.

24:28 Yeah.

24:29 That's a really interesting way to phrase it and to look at it.

24:32 So what kind of relational features does Rethink have?

24:36 Like does it have foreign keys?

24:38 Does it have joins?

24:39 Does it have transactions?

24:41 Does it take any of these kinds of things on?

24:43 Or does it lean more towards the MongoDB style where it foregoes those for other reasons?

24:49 Okay.

24:49 So RethinkDB does particularly distributed joins.

24:53 It does distributed subqueries.

24:55 So as far as the complexity of a given query, you can do almost anything in RethinkDB and sometimes more than you could in a relational database.

25:04 So we take on all of those challenges.

25:06 And particularly distributed joins are extremely useful in almost every use case, actually.

25:11 So we don't do transactions.

25:14 At least we don't do transactions for now.

25:16 There are some proposals on integrating them, but they're still on the drawing board.

25:20 We don't do foreign keys in the way they're understood in relational databases.

25:24 So for example, we don't do cascading deletes or things like that.

25:28 So RethinkDB has way more relational features than any other document database that I can think of.

25:36 Like certainly way more than Mongo.

25:38 But it's not quite as good for relational data as like say Postgres or Oracle or SQL Server or MySQL or one of the many relational database management systems.
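To make the join discussion above concrete, here is a minimal in-memory sketch in Python of an equality join between two collections of documents. It is a toy under stated assumptions: the `eq_join` helper, the sample `users` and `posts` data, and the field names are all hypothetical, plain lists stand in for tables, and nothing here reflects how RethinkDB actually executes or distributes a join. The `left`/`right` pairing simply mirrors the result shape that ReQL's `eq_join` command is documented to return.

```python
def eq_join(left_rows, left_field, right_rows, right_key="id"):
    """Pair each left row with the right row whose key equals left_row[left_field]."""
    # Index the right-hand "table" by its key so each lookup is a dict access.
    right_index = {row[right_key]: row for row in right_rows}
    pairs = []
    for left in left_rows:
        match = right_index.get(left.get(left_field))
        if match is not None:  # rows without a match are dropped (inner join)
            pairs.append({"left": left, "right": match})
    return pairs


# Hypothetical sample data: posts reference their author by id.
users = [{"id": 1, "name": "Slava"}, {"id": 2, "name": "Michael"}]
posts = [
    {"id": 10, "author_id": 1, "title": "Why real time?"},
    {"id": 11, "author_id": 2, "title": "Designing ReQL"},
    {"id": 12, "author_id": 99, "title": "Orphaned post"},
]

joined = eq_join(posts, "author_id", users)
titles_by_author = [(p["right"]["name"], p["left"]["title"]) for p in joined]
```

The post with `author_id` 99 has no matching user and is silently dropped from the result. That is one way to picture the point about foreign keys: in this sketch nothing prevents a reference from dangling.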

25:49 Okay.

25:49 Yeah.

25:50 Very interesting.

25:50 What does it look like to use this?

25:53 And maybe you could give us, I don't know, an example from the Python driver, if you know it off the top of your head, or if not, just from any other of the drivers.

26:01 Like if I create a new project and I want to, I have Rethink running, how do I connect to it and get going?

26:08 Like what are the steps?

26:09 You don't have to say code exactly.

26:11 One of the biggest design goals that we had with Rethink was to make it very, very easy to use.

26:16 And we literally thought like every extra step that a developer has to go through will cut our user base in half.

26:23 So we just made it as simple as possible.

26:25 And we spent a lot of time doing that.

26:26 The easiest way, I mean, to figure it out is like if you go to rethinkdb.com/docs,

26:31 there's like a 30-second tutorial that shows people how to use it.

26:34 But in Python, it's very simple.

26:36 You import rethinkdb, and it's just a Python driver for Rethink.

26:40 You say rethinkdb.connect, and then you run a query.

26:43 Like, for example, it's table users, then .insert.

26:47 You just put a document in there, .run.

26:49 That inserts the document into the database.

26:52 It's really, really simple.

26:53 It's just a couple of lines.
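The steps described above (import the driver, connect, insert a document, run) might look roughly like the following sketch. Several things are assumptions rather than anything stated in the episode: the `rethinkdb` package is installed, a server is listening on `localhost:28015`, a `users` table already exists in the default database, and the sample document and the `insert_user` and `main` helper names are made up for illustration. The `import rethinkdb as r` style matches drivers from around the time of this episode; later driver releases changed how `r` is obtained.

```python
def insert_user(r, conn, user):
    """Insert one document into the 'users' table and return the server's reply."""
    # Build the query by chaining, then send it over the open connection.
    return r.table("users").insert(user).run(conn)


def main():
    # Imported here so this file still loads where the driver is not installed.
    import rethinkdb as r  # assumption: 2016-era import style

    conn = r.connect("localhost", 28015)  # assumption: local server, default port
    try:
        reply = insert_user(r, conn, {"name": "Michael", "show": "Talk Python"})
        print(reply)
    finally:
        conn.close()
```

Calling `main()` against a running server should print the server's summary of the write. Passing `r` and `conn` into `insert_user` is only a design convenience so the function can be exercised without a live database.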

26:55 Yeah, and you guys have a nice fluent API where you say, I want to create a query, .filter by this, .limited by that, and order it by that, and so on, right?

27:03 You can chain them together in a really nice style.

27:06 Yeah, so that was inspired by jQuery.

27:08 So the goal of the query language was, first of all, to make it seem native in the programming language that people use.

27:16 So if you're using Python, the query language, I mean, it's just Python.

27:19 If you're using Ruby, it's just Ruby and so on.

27:21 And one of the biggest kind of challenges that people run into when they use SQL is if you look at Stack Overflow questions that people asked about SQL, they're kind of different from questions about any other programming language.

27:34 Like if someone's using Python and you look at Stack Overflow questions, the questions are like, I'm trying to do this.

27:39 It's not quite working.

27:40 I don't understand what's going on or how does this function work or things like that.

27:45 But in SQL, people ask weird questions like, I want to do this.

27:49 I don't know how to express it.

27:50 Like, how do I even do this?

27:52 Right?

27:52 And that's because SQL was designed for business analysts, and it's kind of like, it was sort of meant to be like English, but it's not really English because it's a programming language, of course, or a query language.

28:02 So it's kind of challenging.

28:04 And the goal with ReQL was to make it really, really intuitive.

28:08 So you start out, and you can think of it as a data flow language, kind of like Bash and pipes, where you start out with a table, and then you say,

28:15 I want to run this transformation, and then I want to run that transformation after that, and then you can keep chaining things.

28:20 And it turns out to be really intuitive because people can write queries by just chaining on things and seeing intermediate results and incrementally building up the query until they finally get to what they want.

28:31 So with ReQL, it just becomes really, really easy to express what you want in a way that I think SQL could never allow people, just because of the fundamental differences in the design.
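The data flow idea described above, where you start with a table and chain one transformation after another while inspecting intermediate results, can be illustrated with a small in-memory sketch. This is not the RethinkDB driver: the `Query` class and the sample data are hypothetical, and everything runs on a plain Python list. The method names `filter`, `order_by`, and `limit` are chosen to echo ReQL commands of the same names.

```python
class Query:
    """Toy chainable query over a list of dicts; each step returns a new Query."""

    def __init__(self, rows):
        self._rows = list(rows)

    def filter(self, predicate):
        # Keep only the rows for which the predicate is true.
        return Query(row for row in self._rows if predicate(row))

    def order_by(self, field):
        # Sort ascending by the given field.
        return Query(sorted(self._rows, key=lambda row: row[field]))

    def limit(self, n):
        # Keep at most the first n rows.
        return Query(self._rows[:n])

    def run(self):
        # Materialize the current rows as a list.
        return list(self._rows)


# Hypothetical sample data standing in for a table.
users = [
    {"name": "Ada", "age": 36},
    {"name": "Bob", "age": 17},
    {"name": "Cy", "age": 52},
    {"name": "Di", "age": 29},
]

adults = Query(users).filter(lambda u: u["age"] >= 18)  # intermediate step
youngest_first = adults.order_by("age")                  # chain one more step
result = youngest_first.limit(2).run()                   # two youngest adults
```

Because every step returns a new `Query`, any intermediate value such as `adults` can be run on its own to see what it contains before the next transformation is added, which is the incremental style described in the conversation.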

28:40 Yeah, it feels a little more written for programmers in a simple way rather than trying to create yet another language.

28:48 So yeah, that's really nice.

28:50 Yeah, for sure.

28:50 Yeah, that's really nice.

28:51 So one thing that you guys focus on a lot, and I think it feels like it's a little bit of the influence of your co-founder, Michael, is design and the way it feels to use software.

29:04 All of the design stuff and usability, that's mostly Michael's doing, and Michael really cares about the user experience.

29:12 And he takes it from, you know, I mean, it's not just how things look or how many steps you have to go through.

29:17 He thinks it through to the point of, like, what does the user feel when he interacts with a particular feature?

29:24 What does the user feel when he interacts with the product as a whole?

29:28 So he spends a lot of time just thinking holistically about the user experience and the kind of subjective experience the developers go through when they use the product.

29:39 And that permeates everything.

29:41 That permeates the query language.

29:43 That permeates, like, the install process, the art that goes into RethinkDB and into the documentation to make things easy.

29:51 You know, the website, the admin console, like, all of these things.

29:55 Yeah, I was thinking specifically the admin console because a lot of databases, the tools that use them feel not so great, right?

30:04 Like, sometimes you just have a command line interface to it.

30:07 You know, if you have a UI, it usually looks like it was created by a DBA.

30:13 You know, something like this, right?

30:14 It's not a great experience, but you guys have, like, a beautiful, simple-to-use web management interface for your databases, even for, like, sharding and replication and failover and all these kinds of things, right?

30:28 Yeah.

30:29 So our hypothesis was that, like, people spend, if you're running a web application, there's a lot you do on the front end, sometimes a lot on the back end.

30:39 But the database is at the core of the thing, so people spend most of their time writing database queries.

30:44 And if you think about, like, what's that like for the application developer?

30:49 Well, that means they spend, you know, six, seven, eight, sometimes more hours a day just dealing with the database product.

30:55 It's their life at work.

30:57 And it was very important that that experience is pleasant for them.

31:01 And that, for me, it's a lot of different things.

31:03 But one of the biggest ones is we thought that there needs to be almost like a development environment where people feel comfortable, that's easy to learn, easy to experiment with queries, easy to figure out what's going on, easy to test things, play around, easy to figure out what happens in your cluster.

31:20 So that was the goal of the web UI.

31:22 It wasn't an afterthought.

31:23 It was something we thought was very, very important for the people, for our users, right, for developers that are going to use Rethink.

31:30 Because they spend so much time in it every single day that not adding it or not building it and not thinking it through almost didn't really make any sense.

31:39 Like, we were really surprised that that doesn't exist in a lot of other database systems.

31:45 Because if you think about how much time people spend on them, that just seems like a crucial, crucial thing to do.

31:50 It would almost be like building a compiler without having a text editor.

31:54 Like, the text editor is a fundamental part of writing programs.

31:57 It's really, really, really important.

31:59 Like, you need a good compiler, but without the text editor, like, just the experience of using the compiler would be miserable.

32:05 Yeah, that's a good analogy.

32:06 It feels like you guys really focus on making, interacting with the whole system delightful.

32:13 Yes, that's very important.

32:14 Yeah, one of the things that really surprised me in a positive way was I heard that you guys have a full-time artist.

32:21 Like, if you look at your documentation, you have little, like, cartoon characters and stuff to make it feel friendlier.

32:28 You know, in particular, I have, like, rethinkdb.com/docs/quickstart pulled up.

32:34 And there's, like, a little database walking up to this character.

32:37 Most database companies I think of are on the opposite end of the spectrum from this type of experience.

32:45 This is really cool.

32:46 Yeah, definitely.

32:47 So, when we were first shipping everything, like, before the first version, we thought it was important to do just basic things that you do, like, you know, branding.

32:58 Just, like, kind of the things that every open source or every project does in general.

33:03 So, we hired an artist to do a lot of the work.

33:06 And Annie kind of came in and she did the original things.

33:10 But she also had an enormous amount of passion around art.

33:13 She brought that passion to the company.

33:15 And very quickly it became obvious to Michael and me that Annie's work and her passion for the art could permeate a lot more than just the basic things, like the logo and some illustrations on the front page.

33:30 And she was very adamant about, like, hey, this could really change the experience of people interacting with the product.

33:39 And because Michael is a user-experience person, like, he immediately grabbed onto this idea.

33:43 And Annie did a couple of things early on.

33:46 Like, she made the dog.

33:47 She made illustrations for the documentation.

33:48 And people started commenting on it.

33:50 Like, on Twitter, people would say, wow, that makes everything feel way more accessible.

33:54 She brought that passion into the company.

33:57 And then it was just obvious that it makes everything way better for our users.

34:01 And no one else does that.

34:03 So, it was also kind of differentiating for the company because people notice it.

34:06 She, like, added this whole new dimension of interacting with the software project that you don't often see in other projects.

34:13 Yeah.

34:13 I'm sure you guys are really delighted.

34:15 Like, wow, this really does make this so much friendlier.

34:18 It's cool.

34:20 I really like it.

34:21 It's a nice touch.

34:21 Yeah.

34:22 It's kind of unobvious at the beginning that that would matter.

34:25 But then when we did a couple of these things and, like, everyone started noticing and people started commenting, like, wow, this makes everything way more accessible.

34:31 We were like, yeah, we need to do more of this.

34:33 Yeah, for sure.

34:34 This portion of Talk Python To Me is brought to you by Hired.

34:49 Hired is the platform for top Python developer jobs.

34:52 Create your profile and instantly get access to 3,500 companies who will compete to work with you.

34:57 Take it from one of Hired's users who recently got a job and said, I had my first offer on Thursday after going live on Monday and I ended up getting eight offers in total.

35:05 I've worked with recruiters in the past, but they've always been pretty hit and miss.

35:08 I tried LinkedIn, but I found Hired to be the best.

35:11 I really like knowing the salary up front.

35:13 Privacy was also a huge seller for me.

35:16 Sounds awesome, doesn't it?

35:17 Well, wait until you hear about the signing bonus.

35:20 Everyone who accepts a job from Hired gets a $1,000 signing bonus.

35:23 And as Talk Python listeners, it gets way sweeter.

35:26 Use the link hired.com/talkpythontome and Hired will double the signing bonus to $2,000.

35:31 Opportunity's knocking.

35:33 Visit hired.com/talkpythontome and answer the door.

35:43 Can you talk about the community around RethinkDB?

35:46 It's grown really quickly.

35:48 You've got a very passionate user base.

35:50 You guys do a lot to engage the community.

35:53 Can you talk about some of the things you do?

35:54 Yeah.

35:55 So the user community is one of those things that's also the core of the company.

36:01 We thought it's very important to do good community building and connect with our users.

36:08 And usually the idea that people have about open source is that you build the software and you make the source code available.

36:17 But that doesn't make a community happen, right?

36:19 And actually, most of the work is done by Michael.

36:22 And we have someone here, Christina Keelan, who does a lot of community management.

36:26 She's absolutely amazing at it.

36:28 And so the way we approach the idea of building a community is everyone who uses RethinkDB or contributes to it in any way is kind of equal.

36:38 And we, the employees of the company, we just happen to be paid for our work, but we're also just members of the community.

36:44 And what that means is it's not just about publishing the source code.

36:48 It's about doing everything openly so that users could communicate with us.

36:53 Like, for example, we do all design discussions on GitHub.

36:56 And it's, you know, our employees comment on features and design proposals and things like that.

37:02 But they do it with our users because anybody in the world who's using RethinkDB can go on GitHub and say, hey, I think this should be done this way.

37:10 People can contribute.

37:11 So the whole thing is done in the open.

37:12 And that is huge for fostering a good community because people feel invested in the project and they feel like, you know, their opinions are really going to be heard and that they can kind of drive the direction of the project.

37:24 They can drive the direction of the features, how they're going to be designed.

37:26 So that's one of the things that we do.

37:28 The art is really important.

37:29 We do a lot of local meetups.

37:31 We try to engage everyone on social media, you know, on Twitter, Facebook, things like that.

37:37 So community is just, it's about as fundamental to RethinkDB as the software itself.

37:43 And we take it really seriously and we think through a lot of the interactions, you know, how users feel, how they interact with the project.

37:50 So it's been a pretty big deal for us.

37:53 And at the beginning, like, we didn't know whether any of this was going to work.

37:56 But then the community grew really, really quickly.

37:58 And, yeah, it turned out to be really important.

38:01 That's great.

38:01 It's one of the really cool aspects of open source, right?

38:06 And you guys have a successful, thriving business based on a thing that I can go to GitHub, click download as a zip file or a clone.

38:17 And I have the product, basically.

38:20 And so I'm really fascinated and delighted when I see companies making successful businesses out of open source projects.

38:29 Can you talk about what it's like to run a business where the main thing you have is sort of given away or out in the open?

38:38 Yeah.

38:39 Yeah.

38:39 Sometimes I talk to my dad about that and he still doesn't understand how this works.

38:43 He's like, okay, so you give away the product for free and anybody can get it for free, right?

38:48 Like, how does that work?

38:49 Yeah.

38:50 And so what happens with, particularly with RethinkDB is our goal was to make it available to anybody who wants it.

38:59 And if the product is good and the world is really going in the direction of real time, then eventually RethinkDB will be in most of the development stacks.

39:07 So what we wanted is to make it accessible to anybody who wants it.

39:10 Like if it's a student who's building a new project or experimenting with new ideas or a hobbyist, they should be able to get it for free.

39:16 But with a product like a database, it's very easy to run it on your laptop and you don't need to pay for it.

39:23 You don't need any support.

39:24 But if you're a big organization that's deploying RethinkDB across five data centers around the world,

39:30 there are enormous amounts of operational challenges that these companies have to deal with.

39:35 And they're pretty risk averse too.

39:37 So, you know, if you're deploying a big application across the world, it can't fail.

39:43 You have to make sure you do health checks.

39:46 You have to make sure everything works right.

39:48 So for companies like that, we sell support and services.

39:51 And most of the revenue comes from that.

39:54 And that wouldn't work for every open source project.

39:57 Because, for example, if you're selling like a developer tool, like a text editor, there is no operational component in it.

40:03 But for open source projects that have a big operational component, like it has to run 24-7, people take that very seriously and they buy support, you know, in big organizations.

40:13 So that's how the business works.

40:14 And it's not applicable to every open source project.

40:19 It's only applicable to open source projects where there's a big operational part.

40:23 If there isn't a big operational part, there may or may not be open source business there.

40:28 I don't know exactly.

40:30 But for us, it's the big, large-scale operational component that makes the financial aspect of the company work.

40:37 Yeah, that makes a lot of sense to me.

40:39 I guess if you're building a product where you draw the architectural diagram and your product is on the bottom or if it's like a hub and spoke and your product is in the center, that's a thing that can't fail.

40:52 And databases cannot fail, not in the sense that it has a bug or something.

40:58 But, like, if somehow it goes down or you can't connect to it or it doesn't replicate, like, all sorts of bad stuff happens when the data stops flowing, right?

41:07 So you're right that you're in somewhat of a unique place where this is really something that people depend upon.

41:14 Yeah, absolutely.

41:16 And another thing about databases is that Rethink is really easy to use, probably about as easy as any other project you can think of, but, you know, with the same complexity.

41:26 But it's also, at the same time, databases are fundamentally complex and distributed databases are even more so.

41:32 So we make it very, very easy.

41:34 I mean, you can distribute RethinkDB with the click of a button.

41:36 But if you have enormous amounts of data and many, many data centers, people will pay for support just for the safety of knowing that if something goes wrong, they can pick up the phone and their problem will be solved.

41:48 And it's really, really important because their businesses depend on it.

41:51 So, yeah, for distributed databases like Rethink, that's what makes the whole thing work.

41:55 There are examples.

41:57 So, for example, with, like, web servers, they're relatively simple and they're really robust now.

42:01 So you wouldn't necessarily buy support if you use, like, Nginx or Apache.

42:04 So Rethink is definitely in a unique position.

42:08 I wouldn't necessarily take this lesson and apply it to every open source project.

42:12 Right.

42:12 But if you think it through a little bit, there are a lot of projects where this methodology applies.

42:17 Absolutely.

42:17 Every project kind of has to find its way.

42:20 You know, I had Pablo Hoffman from Scrapy, the open source web scraping project, on.

42:27 And what he ended up doing to sort of build a business around the web scraping library was to create web scraping as a service and have, you know, one click, push your code to the cloud and we'll handle all the infrastructure and management and scaling.

42:43 I mean, there's all these different ways.

42:45 And I just found that to be fascinating.

42:47 There's all these different ways in which you can do it.

42:49 But I think giving everybody in the Python community, the open source community examples and a bit of a look inside is really cool.

42:57 So you talked about having the whole interaction with Rethink in general be as simple as possible and easier than everything else.

43:08 Do you ever feel like, I guess, is there a tension between, hey, we could make this even easier and, oh, but if that's easier, we might get less support calls about this.

43:22 Is that a tension that you balance or do you just always go for improving the product and then go from there?

43:29 So in practice, this turns out not to be a tension that's actually important.

43:33 Because if you look at, I mean, this isn't unique to RethinkDB.

43:37 This is pretty much any operational product like this.

43:41 If you look at where most of the revenue comes from, it comes from really, really big customers.

43:46 And with Rethink, it doesn't matter how easy to use we make it; big customers will still have enormous challenges that very few people face.

43:55 It doesn't matter which product they use.

43:57 Like we make their challenges go away, but they still have to pay for support.

44:01 So the tension comes into play when you're talking about smaller customers.

44:05 But smaller customers don't really pay that much for databases anyway.

44:09 So, you know, if you're trying to maximize revenue, you can pretty much not worry about like really small businesses because they won't pay that much anyway.

44:18 And you can focus on selling, on making, you know, the commercial aspects of the project really compelling to big companies.

44:25 And for that, there is no tension between ease of use and revenue whatsoever.

44:29 For smaller businesses, yeah, there's a little bit of that tension.

44:31 But if you look at the numbers, it turns out not to be that important.

44:35 So we never really think about it.

44:36 We make it as easy as possible every time.

44:39 Cool.

44:39 That makes a lot of sense.

44:40 I mean, big companies just want reassurances, right?

44:46 When you're a 50,000 person business and the thing that you guys sell depends on the data that keeps flowing, it doesn't matter if you can push a button to like scale.

44:55 They want somebody that will take away the risk around that, right?

44:59 Yeah, it's taking on the risk.

45:00 It's also things like training.

45:01 Like, for example, most of our big customers, like the teams that interact with RethinkDB, you know, they can be up to 100, 150 people.

45:11 And it's not that 150 people necessarily like work with Rethink itself, but they're somehow related, you know, to the application or the operations of it or something that's relevant.

45:21 So you have to train.

45:22 If you think about it from a high level perspective, like you have to train 150 people to use this new product.

45:27 Like, yeah, they could learn it on their own.

45:29 They could go online.

45:30 They could read the documentation.

45:31 But we have structured courses where we can come in and we can teach people and get them up to speed very, very quickly and get them to be productive.

45:39 So big companies have a whole different layer of challenges that they face.

45:44 And they have usually more money than time.

45:46 Yeah.

45:47 So they're happy to trade one for the other.

45:48 And that's essentially what we do.

45:50 Sure.

45:50 If you have to get 100 people up to speed on something, training is so much more often the right answer to do it in a week or two rather than say, okay, everybody, go figure this out and we'll get back together.

46:01 Yeah.

46:03 Yeah.

46:03 So you guys just had a release.

46:06 What was it?

46:06 2.3 that you added a bunch of new features.

46:09 Do you want to highlight that?

46:09 Yeah.

46:10 So we do.

46:11 One of the things about our releases is we try to do frequent releases.

46:15 We try to release a new version of Rethink every two to four months, depending on how things go.

46:20 So in 2.3, we added user accounts, we added encryption, we added official Windows support.

46:25 So a lot of these features, they're a little bit boring in the sense that most of these were built for bigger companies.

46:30 Like as we get more and more big customers, they have demands that may not be necessarily important to developers that like download Rethink and try to build a simple app.

46:39 But yeah, RethinkDB 2.3 had a lot of features like that.

46:43 They were kind of targeted at big customers.

46:44 They were targeted at scale.

46:46 Things like encryption, compliance, like a lot of stuff like that.

46:49 In RethinkDB 2.4, which is coming out, we are adding a lot more things to the query language that are going to be really, really exciting.

46:57 We're adding real-time aggregations so people can do real-time analytics much easier.

47:04 We're kind of expanding ReQL to support new terms.

47:07 There's going to be a lot of exciting stuff pretty much for everybody.

47:11 Okay.

47:11 That's awesome.

47:13 Yeah.

47:13 Some of those enterprise features, they don't make you jump up and down with excitement.

47:17 But that's critical to being adopted in these big companies.

47:21 And that's an important part of the business, right?

47:25 Yeah.

47:25 I mean, it's really important.

47:27 If you're running a database across multiple data centers over the internet, encryption is extremely important.

47:32 So we have to add that.

47:33 It's not necessarily that exciting, but it's kind of a showstopper for a lot of big companies.

47:38 Yeah.

47:39 Is there a Rethink as a service?

47:41 Like, can I go somewhere and pay $10 a month and have a small Rethink cluster I can work with or something like this?

47:49 Yeah.

47:50 So there are a couple.

47:51 The biggest one is actually done by IBM.

47:53 They bought a company called Compose.io.

47:55 You can go to compose.io.

47:57 And they host, actually, like most document-oriented databases.

48:00 So they do RethinkDB, MongoDB, Elastic, I think a couple of others.

48:05 It's pretty inexpensive to get started.

48:07 It's very cheap if you just want a single node.

48:09 And then they allow you to scale up pretty much as much as you want.

48:12 So Compose.io is probably the easiest way.

48:15 There are a couple of others.

48:16 And, of course, people can run it on Amazon themselves.

48:19 There's lots of different options.

48:21 But most users that want Rethink as a service use Compose.io right now.

48:24 Okay.

48:25 Yeah, that's cool.

48:26 Nice that that's out there.

48:27 So while we're talking, I'd like to talk about Horizon.js.

48:32 So tell me what Horizon.js is.

48:36 So Horizon.js is a new project that we just launched a couple of weeks ago.

48:40 It's built on top of RethinkDB, and it was an experiment that I actually think turned out to be really successful.

48:46 So the motivation behind Horizon was that there were a lot of users who were new to databases,

48:53 they wanted to build apps, mobile apps, web apps.

48:56 So they'd go on the bug tracker, and they'd say, hey, I'm trying to access the database from the browser, and it doesn't work.

49:03 And, of course, it wouldn't work because databases fundamentally have to be accessed from a back end.

49:07 And people kept asking this question, and we thought, hey, maybe we can make it easy to access the database from the browser.

49:13 So we built Horizon, which is a prefabricated back end, and what it lets people do is build JavaScript apps without writing any back end code.

49:21 So you can build a mobile app or a web app, and the back end, all of the back end is handled completely by the Horizon project.

49:28 You just start Horizon.

49:29 It's basically a server.

49:31 It connects to RethinkDB, and then all you have to do is write code on the front end,

49:36 and then all the back end stuff will be handled by Horizon itself.

49:39 So what that does is it makes building real-time apps dramatically easier, again,

49:44 because you don't have to write a single line of back end code.

49:47 And the goal behind Horizon was as people build more sophisticated apps,

49:51 they're going to, of course, need to write back end code.

49:54 So when the app gets complex enough, you can stop using Horizon as a standalone server and import it into Node.js

50:00 and start writing back end code and still use all of the Horizon services.

50:04 That's Horizon, and we didn't know how important it was going to be.

50:08 We wanted to try it.

50:09 It seemed like it would make a lot of people's lives easier.

50:11 And right now, so we launched it a couple of weeks ago.

50:13 I think right now Horizon already has a quarter of RethinkDB's user base.

50:17 So it turned out to be pretty successful, and it's growing.

50:21 It makes building things easier.

50:22 Yeah, people must have really been waiting for that.

50:25 So interesting that it basically, it's not so much a front end thing as it is a back end with an API

50:33 to alleviate the need for front end people to write back end code that they probably would rather not write anyway.

50:39 Yep, that's exactly what it is.

50:41 So Horizon does come with a small front end library, but it's meant to be used with React or Angular or one of the many, many front end frameworks.

50:52 We don't actually do very much front end.

50:54 It's just, it helps front end developers build apps without writing back end code.

50:59 That's kind of all Horizon is, but it's a lot.

51:01 Because it turns out that there's an enormous amount of plumbing that people have to deal with over and over and over again,

51:07 and Horizon just takes care of that.
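
The "prefabricated back end" idea described above can be sketched as a generic server that stores documents in named collections and pushes every write to subscribers, so the application author writes no collection-specific back end code. This is a conceptual illustration in plain Python, not Horizon's code or API (Horizon itself is Node.js); the `CollectionServer` class and its `store`, `watch`, and `fetch` methods are hypothetical names chosen for the example.

```python
class CollectionServer:
    """Hypothetical stand-in for a generic, prefabricated back end."""

    def __init__(self):
        self._collections = {}  # collection name -> {document id: document}
        self._watchers = {}     # collection name -> list of callbacks

    def watch(self, name, callback):
        """Register a callback that is invoked on every write to a collection."""
        self._watchers.setdefault(name, []).append(callback)

    def store(self, name, doc):
        """Save a document, then notify everyone watching that collection."""
        self._collections.setdefault(name, {})[doc["id"]] = doc
        for callback in self._watchers.get(name, []):
            callback(doc)

    def fetch(self, name):
        """Return all documents currently in a collection."""
        return list(self._collections.get(name, {}).values())


# The "front end" only talks to the generic server; nothing here is app-specific.
server = CollectionServer()
seen = []
server.watch("messages", seen.append)
server.store("messages", {"id": 1, "text": "hello"})
server.store("messages", {"id": 2, "text": "world"})
print(seen)
```

In a real deployment the watch callbacks would be delivered to browsers over the network, and the storage would be a database rather than a dictionary; those parts are the plumbing the conversation refers to.
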

51:08 Yeah, that's cool.

51:09 And what's it written in for the back end?

51:13 Is that like a custom Node.js server or something different?

51:16 Yeah, Horizon is all Node.js.

51:18 Yeah.

51:19 When you said you can take it and plug it into Node, I figured you guys must just be going,

51:24 here's the Node thing you've got to run, and if you want, you can put it into your app, right?

51:28 That's cool.

51:29 Yeah, that's exactly how it works.

51:30 Can you talk about the popularity of various back ends for RethinkDB, or middle tiers, I guess?

51:37 How frequent is it that people are working with it from, say, Python versus Node versus Ruby?

51:42 Do you know those numbers?

51:44 So I actually, I don't know these numbers off the top of my head.

51:47 My intuition is that most of the users are using Node.js.

51:52 I think Python is the second biggest, and Ruby is very close.

51:57 Java is really popular in enterprise environments.

52:00 .NET is really popular in enterprise environments.

52:03 I think Node is still the biggest.

52:04 I honestly don't know the exact breakdown.

52:07 And it's kind of actually hard to figure out because the drivers are accessible through the package managers for their respective language.

52:15 And some of them have statistics, some of them don't.

52:18 Like, some of them measure statistics differently, so it's fairly hard to compare.

52:22 Yeah, apples, oranges, yeah.

52:25 Yeah.

52:25 Okay, interesting.

52:26 I noticed something that you guys have on your Horizon project that's in private beta.

52:33 Well, it's almost all that way because it's just so brand new, right?

52:37 But it is this thing called Horizon Cloud.

52:40 What's that?

52:40 Well, so Horizon Cloud is a way for people to deploy and manage and scale Horizon applications.

52:47 So the whole stack is open source, right?

52:49 RethinkDB is open source.

52:50 Horizon is open source.

52:51 Anyone can download them.

52:53 Anyone can build an app.

52:54 You can deploy the app any way you want, you know, on Linode or AWS or Azure or pretty much in any way that you want.

53:02 But deploying an app at scale is still pretty challenging.

53:08 And what we learned from our customers is very often, you know, they'll make these huge RethinkDB and Horizon deployments.

53:17 And they'll call us and, you know, they'll buy a support contract and we help them out with a lot of these deployments.

53:23 So in this process, we learned a lot about best practices.

53:27 We learned about the patterns of, you know, how to scale big applications, how to scale big RethinkDB clusters.

53:32 And Horizon Cloud takes, it's basically software as a service or platform as a service.

53:38 It takes all of that knowledge and operationalizes it in a service.

53:41 So the goal is if you want to deploy a massive Horizon or RethinkDB application, you can do it yourself because everything's open source, or you could use Horizon Cloud and then we'll deal with all of the deployment and scaling and management issues.

53:55 So that's the goal of Horizon Cloud.

53:57 It basically takes away all the headaches of deploying and managing and scaling these applications.

54:03 Interesting.

54:03 So it sounds like it takes some of the consulting work that you might have been doing and turns it into like a framework or an offering that's automatic in some sense.

54:14 Yes.

54:15 Yeah, that's exactly what it does.

54:17 Horizon Cloud right now is built on Google Compute Engine.

54:23 So we'll deploy everything to Google Compute.

54:26 But eventually Horizon Cloud will run on pretty much any backend cloud service.

54:31 So people will be able to pick.

54:33 And for enterprise customers who don't necessarily want to run on the cloud, they'll be able to run Horizon Cloud on their own internal infrastructure.

54:39 Okay.

54:40 Yeah, that sounds awesome.

54:41 So congratulations on the launch with Horizon because that really looks like something people were waiting for.

54:48 Yeah, we're very excited about it.

54:50 It was a lot of work and a different kind of work from building a database, but people seem to really like it.

54:55 Yeah, it's cool.

54:56 You know, you get really successful with the one thing, like RethinkDB.

55:00 And you think, well, how are we going to make another thing that's equally successful, right?

55:05 It's really challenging but also interesting to create these new products that somewhat level up on each other but at the same time are new things.

55:14 So, yeah, nice.

55:15 Yeah, it's really exciting.

55:16 And you kind of get better at building products after a while.

55:18 So, I think building Horizon was certainly easier than building RethinkDB, but it was still pretty challenging.

55:24 Yeah.

55:26 It feels to me like whenever you build a product, an app that's going to ship somewhere major or something like that, you feel like you're almost done.

55:33 And then there's like a hundred little small things that you keep having to do.

55:38 And it just takes way longer.

55:40 So, when you finally do ship, it feels great.

55:42 Oh, yeah.

55:42 Yeah, it takes way longer even if you plan for it.

55:45 It still takes longer.

55:46 You tell yourself it's going to take twice as long, and it still doesn't feel right that it takes as long.

55:50 Awesome.

55:51 So, we're just about out of time.

55:54 Let me ask you a couple of questions before we call it a show.

55:59 The question I always ask my guests when we get to the end of the conversation is, when you write code, what editor do you use?

56:07 Emacs.

56:08 Always Emacs for me.

56:10 Emacs, right.

56:10 You have a big Lisp background, right?

56:13 Like you started out doing a lot of Lisp code.

56:15 Is that correct?

56:16 Yeah, I was really excited about Lisp for a long time and to a large degree I still am.

56:20 Although we don't use it at RethinkDB.

56:23 We use a lot of ideas that come from Lisp, but not the language itself.

56:26 But, yeah, I was very excited about Lisp, Common Lisp, and I got into Emacs and Emacs Lisp.

56:31 So, that's part of my background; I still use Emacs every day.

56:35 Honestly, I don't think I'm ever going to switch to anything else.

56:38 Yeah, it's awesome.

56:39 That just sounds impossible.

56:40 It's a world you cannot imagine, right?

56:43 Mm-hmm.

56:43 Yeah, funny.

56:44 Okay, and then while you have everyone's attention, any final calls to action?

56:49 How do they get started with Rethink?

56:51 For whoever's maybe listening from the RethinkDB community:

56:53 You guys are amazing, and you make everything worthwhile, and you make the product better.

56:57 For anyone who hasn't used RethinkDB, I'd encourage you to go to rethinkdb.com.

57:02 You can download it in a couple of seconds.

57:04 You could get started very, very quickly.

57:05 And we're on Twitter at @rethinkdb.

57:07 If you have any questions, we're always there to help out.

57:10 All right, fantastic.

57:11 So, this has been a really interesting look inside your company, building open source projects, a cool, fresh new database.

57:18 Thanks for taking the time to chat with us all.

57:20 Yeah, it's my pleasure.

57:21 Thank you for having me on the show.

57:22 Yep, bye-bye.

57:31 Are you or a colleague trying to learn or improve your Python?

58:00 Have you tried books and videos that just left you bored by covering topics point by point?

58:04 Check out my online course, Python Jumpstart by Building 10 Apps, at talkpython.fm/course to experience a more engaging way to learn Python.

58:13 And if you're looking for something a little more advanced, try my Write Pythonic Code course at talkpython.fm/pythonic, or recommend it to a friend or colleague.

58:22 You can find the links from this show at talkpython.fm/episodes/show/65.

58:28 And be sure to subscribe to the show.

58:30 Open your favorite podcatcher and search for Python.

58:32 We should be right at the top.

58:34 You can also find the iTunes feed at /itunes, Google Play feed at /play, and direct RSS feed at /rss on talkpython.fm.

58:42 Our theme music is Developers, Developers, Developers by Corey Smith, who goes by Smix.

58:46 You can hear the entire song at talkpython.fm/music.

58:49 This is your host, Michael Kennedy.

58:51 I really appreciate you all taking the time to listen, give feedback, suggestions on shows, and sharing it with your friends.

58:56 Smix, let's get out of here.

58:59 Outro Music.

59:20 Bye.