RubyKaigi ’18 was held in Sendai from May 31st to June 2nd. Sider’s team had the privilege of interviewing Bozhidar Batsov, the very person who created RuboCop, a code style analyzer for Ruby, as he joined this year’s RubyKaigi as a speaker. Our CTO Soutaro Matsumoto asked Bozhidar what you may also want to know: his thoughts on RuboCop and the style guide.
From Ruby Style Guide to RuboCop
Our first question is about your motivation behind developing RuboCop. What is the problem you are trying to solve with RuboCop?
So I used to be a Java developer before I became a Ruby developer. And although Java has many problems, there were a lot of good lint tools for Java. And in the company that I worked for before, we were working on many projects. It was a consulting company. And people had to move between projects, so it was very important for us back then to have a unified coding style for all of our projects. Otherwise, it was really confusing when you made the switch. “Okay, why are we doing it one way here, another way here…?” and so on.
When I switched professionally to Ruby, the first problem that I had to solve was to come up with some unified coding conventions, because this had not been a problem in Java. There was an official coding style guideline by Sun and an unofficial but extremely popular one by Google. So, everybody was using one or the other. And the code bases were extremely consistent; it was very easy to move around them. And it was fun because I rarely had those conversations like “Are we doing tabs or spaces? Are we putting the braces here or there?”
Then I started this new job as a Ruby developer in a company. I was the only person who actually knew Ruby. We wanted to build numerous projects, but everybody had some different background because Ruby was very new. This was the second company in Bulgaria to be doing Ruby professionally. It was over 10 years ago. So the people on the team were good developers, but they were PHP developers, Python developers, and Java developers.
And half the code reviews were just pointing out basic stylistic problems to them.
So first, I wrote for the employees of the company a Ruby style guide by analyzing several famous books, Matz’s book, the Pickaxe book and so on. Everything that was considered good — I put it together. There were times I had to make some judgment calls because one book would be using one style and another would be suggesting something else. So, I would go with the code of popular projects. I disregarded the coding conventions of Rails because they were very unorthodox and even Rails developers don’t code like Rails is coded internally.
Then I came up with the initial version of the Ruby style guide, and I decided to publish it publicly. It became kind of popular. It gained a lot of traction. We received a lot of feedback. And one of the tickets said: “This is really nice, but it is so big. I could not possibly remember all those rules. It would be really nice if, like Python, which has the style guide document PEP 8, we also had an official tool that enforces this.” I told myself, okay, I’ll take a stab at this. I started to work on RuboCop, and that was the motivation.
So, the style guide is too long and you needed some automatic checking tool.
Yes. I was actually surprised by the fact that there weren’t such guides and linters in the Ruby community at that time. It’s not like I wanted to work on any of this; I had to do it. There were a couple of incomplete style guides, but all the linters were extremely primitive.
I don’t know how well you remember the situation with the transition between 1.8 and 1.9. But almost all of the lint tools were using a parser by the Seattle.rb user group, which was incompatible with Ruby 1.9. And this is why earlier versions of RuboCop were using Ripper, even though Ripper is summarized specifically because absolutely nothing else worked with 1.9. So it was painful. I think that the transition between 1.8 and 1.9 probably killed whatever inertia there was in the original lint tooling for Ruby.
The Community Made RuboCop Special
Today, RuboCop is really popular. It’s one of the most widely used lint tools for Ruby. Was there any milestone that made RuboCop so popular?
I guess I’m just very charming and people trust me (laughs). Jokes aside, I think that there was simply a vacuum and whoever built something like this first was going to have a popular tool on their hands because people clearly needed it. Back in the day it wasn’t so clear, but it’s clear now. Many people were enthusiastic about the first few releases which had very little functionality. And people had been putting up with constant breakages between each version because every time we add new stuff, somebody’s build is broken. Probably you’re very familiar with this. So there was a vacuum that somebody had to fill; it was RuboCop. If it wasn’t, there would have been another lint tool that came next. But I don’t think we made anything special. If there was something special, it was that I was very receptive to feedback from the community.
I didn’t really intend for RuboCop to be so flexible. I wanted something that enforces the Ruby style guide. I didn’t anticipate many of the use cases people would have in the beginning. It never crossed my mind that we would be having a pluggable architecture, that we would have inheritable configurations. But the first version didn’t even have configurations because this was for the style guide. It was supposed to enforce the style guide.
So I think working with the community might have been our secret sauce. And the fact that many people came and helped made things possible, because nobody can build something like this by themselves. At least 10 people contributed hugely to the success of RuboCop.
And that is really important. Without the team, without the community, you can never succeed in anything.
Bozhidar’s own configuration of RuboCop — Consistency wins over your personal preference
Thank you very much. So let’s go to the next question. You are using RuboCop in your team, right?
Of course.
How is RuboCop configured in your team? Is it the default?
No, it’s not the default, it’s very different.
Even though I’m the VP of engineering at Toptal and I can tell them to use the default, I’m a very democratic executive. I left it up to my team to vote on this, and there have been so many struggles.
We have this process in our company where the Ruby developers hold a “style guide update meeting” every six months, in which we decide where we diverge from the official style guides for Ruby and Rails. In the beginning, when the team was smaller, I was going to the meeting and trying to influence them. But the conversations were so heated that I decided that I probably didn’t want to interfere. For the meeting, we would submit an agenda: which rules we want to revisit, and who would be doing the voting. Some of the developers were so passionate that they would try to cheat the system, creating fake accounts with which to vote and so on. I remember that on one call, there were 40 of us. And for one rule to be changed, we received 60 votes, so this was kind of surprising (laughs).
So yeah, many people also think that the style guide expresses my personal preferences; I can assure you this is very far from the truth. There are many things that I strongly disagree with personally, but people believe that they are a good idea and they use them widely in the Ruby community. So, that’s fine with me.
Before I came, I actually read the last couple of interviews on your blog with Matz and with Matsuda-san. It was interesting that Matz mentioned that he disagrees with many of the points in the Ruby style guide as well. I would really like to discuss exactly which ones those are. I think that it’s impossible to create something that pleases everybody and you always have to make certain compromises. At the end of the day, it’s much more important to just agree on something and follow it consistently than what exactly you personally agree with.
Consistency is style. Subjectively, there is better style and worse style. But it’s much more important to have some notion of style, to be working in a consistent fashion. I know that not everybody agrees with this. Some people think that if the interpreter accepts the code, it’s good enough and we should care just about how the code behaves. But when you see every day that people’s contributions are impeded by this, and their onboarding process and so on, it becomes clear that this is not some imaginary problem. It’s a very real problem. Ruby’s internal code base is so messy that I think it makes it harder for newcomers to contribute. They spend a lot of time wondering, “Am I supposed to do it like in this file or like in another file? What exactly am I supposed to be doing?” If you eliminate this uncertainty, people just focus on what they’re supposed to be doing and magic happens.
A Roadmap of RuboCop 1.0 and the Goal of the New Organization
Pocke, the technical advisor of Sider and a dedicated RuboCop committer, was also at this interview. He asked Bozhidar a couple of questions about RuboCop 1.0.
Yesterday (at your RubyKaigi ’18 session), you talked about RuboCop 1.0, but I didn’t get to attend the entire session. Can you tell us about the roadmap of RuboCop 1.0 in more detail?
While the roadmap is not actually set in stone, the biggest milestone for 1.0 is to extract the Rails functionality to a separate gem and call it rubocop-rails. Probably, we will extract the performance functionality as well into a separate gem because it’s very version dependent for MRI. I’m wondering whether this was a very good idea or not. On one hand, when it’s accurate, it’s very valuable, but once it starts misleading people, it’s not very valuable. Maybe for the performance cops we would also add some version ranges, so that each cop would be applied to this MRI version and that one, but not to others. We’ll see about this. We have to reduce the scope of RuboCop and we have to make it more modular.
We have to make a better API in the process for extensions, because right now there is no formal API and that’s a big problem. Everybody who has an extension gem is just guessing, “this is maybe public, this is maybe private.” And it’s bad. So we have to publish some nice API, commit to it, maintain it, and not break it very often.
The other thing for 1.0 would be to find a mechanism for updates without breakages. I mentioned at RubyKaigi that my simple idea is to introduce an extra status for cops. Right now we have enabled and disabled. So we just add a new status and call it “new.” And when you start a new version of RuboCop and you have cops whose status is “new”, you get a warning. It says “you have 10 cops that are with status new.” You then decide whether to enable or disable them, and these warnings disappear. I think that this is a very simple feature, but it’s going to make upgrades so much less painful. And we need to define what exactly is a breaking change. For instance, is changing the default of some cop a breaking change or a non-breaking change? Probably it’s a breaking change. We will change those things, probably just in major versions.
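To make the “new” status idea concrete, here is a minimal Ruby sketch. It is only an illustration of the mechanism described above, not RuboCop’s implementation: the cop names, the COP_STATUSES hash, and both helper methods are hypothetical.

```ruby
# Hypothetical illustration of the "new" cop status idea.
# None of these names come from RuboCop itself.
COP_STATUSES = {
  'Style/ExampleCopA' => 'enabled',
  'Style/ExampleCopB' => 'new',
  'Lint/ExampleCopC'  => 'disabled',
  'Lint/ExampleCopD'  => 'new'
}.freeze

# Returns the sorted names of cops the user has not decided on yet.
def pending_cops(statuses)
  statuses.select { |_name, status| status == 'new' }.keys.sort
end

# Builds the warning text, or returns nil when every cop has been
# explicitly enabled or disabled.
def pending_warning(statuses)
  pending = pending_cops(statuses)
  return nil if pending.empty?

  "You have #{pending.size} cops with status new: #{pending.join(', ')}. " \
    'Enable or disable them to silence this warning.'
end

puts pending_warning(COP_STATUSES)
```

Once the user sets each pending cop to enabled or disabled, `pending_warning` returns nil and nothing is printed, which matches the behavior described in the answer.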
I was also hoping that for 1.0, we are going to extract the node matcher engine from RuboCop and make it pluggable. There were some people from Stripe who told me that they had an idea about a matcher implemented in C++, which would be ridiculously fast. They profiled RuboCop, and they felt that if we switched to such a node matcher, RuboCop would be 10 times faster. We have to verify whether this is true or false, but they seemed like people who know what they are talking about. So it is probably true. That’s not that important but it is somewhat important.
And just to ensure overall consistency in the future, I want us to carefully review the naming and the configuration of the existing cops. The naming should be consistent across the entire code base so we don’t have to rename stuff down the road. We should make sure that the defaults are actually useful to people and that people are not changing them.
I have been thinking that we could leverage either GitHub’s BigQuery API to check some stuff in Ruby projects overall or the gem that we showcased yesterday that could search through all the gems on RubyGems. Something like this will be pretty cool because, honestly, I’m not quite certain whether all the defaults are good. I know they make some sense to me but I’m obviously quite biased.
I think that those are the key points. I don’t want us to have some huge roadmap for 1.0. I want us to have something that we can actually reach in a few months. In practical terms, I think that only the separation of the Rails and the performance stuff is going to be more painful. And even this is not super painful, as we already have gems like rubocop-rspec, so we can do something like what they are doing.
It would be nice to improve some other things, like having a smarter generation of the initial configuration, like your project does; maybe this can replace the current version of it. I hope that we wouldn’t need to migrate configurations between minor versions. If we have to, we would migrate at least just between major versions, given that we commit to more stable development cycles. And I think that that’s pretty much the gist of 1.0.
Thank you. I also have another question. Yesterday, you moved the RuboCop repository from under bbatsov to rubocop-hq. I think you had a few reasons for this organization (rubocop-hq), such as moving rubocop-jp/issues. Can you tell us about other plans for the organization?
My goal for this organization is to become the central hub for all the core RuboCop related activities. I want us to build a bigger core team. Hopefully, I want us to have better synchronization between the different projects because, right now, I have almost no idea who is writing extensions for RuboCop. I am working somewhat with the rubocop-rspec people because they are building the biggest, most complex extension. But when I searched RubyGems for RuboCop and saw 189 gems, I was really surprised. This is actually how I found your Mry and Gry gems, when I was going through this list. We are working together, but I still didn’t know you had built them. I also know that there are some older gems that were pretty useful but are not maintained much anymore.
And it would be nice if we had some central resources. For instance, people often ping the guard-rubocop repository for updates. I know that Yuji (*Yuji Nakayama) is just busy and he doesn’t have time for this, but maybe he can transfer it and we can maintain it in rubocop-hq. I also want people to see that RuboCop belongs to the entire Ruby community, and the same applies to the style guides. It’s not some personal project of mine anymore. It hasn’t been a personal project of mine for quite a while. If so many people are using it, if so many things need to be done, this belongs to the community.
Possibilities of Sider
I hope you know of our service, Sider. Could you give us your comments or opinion on Sider, or on similar code review services, and on how code review can be improved by using RuboCop?
I think when it comes to similar services, you should be asking what RuboCop can do more for the service. I think that the value of services like Sider is undeniable, but you need the underlying tools to help the end users.
There was a lot of overlapping work in the beginning. Everybody was doing their own thing when it came to analysis. And I have been happy in the past few years that people agree that we are going to leverage this library internally, this and that, whatever. So they are focusing on the end user experience and they don’t have to reinvent the wheel every time.
In my company, we are using a lot of similar tools. For RuboCop, we actually have our own integration with GitHub because we have a huge code base that we cannot migrate as easily as I wish we could. Actually, I wasn’t aware of Sider before we started working together (*Pocke started contributing to RuboCop as a committer).
In a way, I hope that we are going to get to a point where RuboCop can provide a good quality CI with everything they need. And for you, it will just become a matter of how to present this to the users. I was thinking that it would be really nice for something like Sider if RuboCop has support for profiles.
If you don’t like the default style and you don’t want to have custom stuff, you say, “I want to run the GitHub profile,” or the Stripe profile or whatever. I think that this wouldn’t be a lot of work; probably it’s a couple of hours. And for the end users, this is going to be really great because you give them some drop-down in the settings and magic happens. So many people complain to me that it’s very hard for them to get started because they have to tweak the configuration. But if you give them a ton of configurations by default, that’s great. And I see this as a responsibility mostly of the integrators like you, not so much of RuboCop: RuboCop adds some support for loading profiles, and you supply as many profiles as you want.
And also it would be really nice if, when somebody is using a hierarchy or overrides of configuration, the CI/UI could show them this. Because people often put different .rubocop.yml files in different folders, they forget about this, and then they wonder why something is producing one kind of offense in some folders and a different one in others. They file bugs and we will say, “How many .rubocop.yml files do you have?”
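As a rough illustration of what such a UI would need to compute, the following Ruby sketch finds the closest .rubocop.yml at or above a given directory. It is a simplified assumption about the lookup, not RuboCop’s own resolution logic (it ignores inheritance between configuration files), and the method name `nearest_config` is hypothetical.

```ruby
require 'pathname'

# Hypothetical helper: returns the Pathname of the closest .rubocop.yml
# at or above +dir+, without looking above +root+. Returns nil when no
# configuration file is found in that range.
def nearest_config(dir, root)
  root = Pathname.new(root).expand_path
  Pathname.new(dir).expand_path.ascend do |current|
    candidate = current + '.rubocop.yml'
    return candidate if candidate.file?
    break if current == root
  end
  nil
end
```

Calling it for two different folders of the same project and getting two different paths back is exactly the situation described above, where the same code produces different offenses depending on where it lives.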
It’s the same with inclusions and exclusions. When users do something wrong, it would be really nice if something presented them a nice UI asking, “Are you sure you really mean those inclusions?” because they seem kind of strange. Or those exclusions. Recently, there was a ticket, “RuboCop stopped processing my .rb files.” Well, that is because you don’t have .rb in your file includes. Maybe RuboCop should be warning about this, but probably it shouldn’t, because that’s just some configuration and the user made a configuration error.
But Sider can be helping them.
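A check like the one suggested here could be sketched in Ruby as follows. This is only an assumption about how an integrator might flag suspicious include patterns, using the standard library’s `File.fnmatch?`; the method names and the sample path are hypothetical, and RuboCop’s own pattern matching may differ.

```ruby
# Hypothetical helper: true when +path+ matches at least one glob pattern.
# FNM_PATHNAME makes "**/" match directories recursively.
def included?(path, patterns)
  patterns.any? { |pattern| File.fnmatch?(pattern, path, File::FNM_PATHNAME) }
end

# Hypothetical helper: flags include patterns that would skip an ordinary
# Ruby source file, so a UI could ask "Are you sure?".
def suspicious_includes?(patterns, sample = 'lib/example.rb')
  !included?(sample, patterns)
end

# Example: these patterns forget plain .rb files entirely.
puts suspicious_includes?(['**/*.rake', '**/Gemfile'])
```

A service could run such a check when the configuration is saved and show the question to the user instead of silently analyzing nothing.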
This is how I see the relationship: you tell us what you need to make the users happier, we build the foundation together, and you present it to them in a way that is going to empower them and make them happiest and most productive, in the true spirit of Ruby.
Advice for New RuboCop Users
Could you give me some comments or suggestions for the RuboCop users and to someone planning to start using RuboCop?
Read the manual (everyone laughs). A lot of the questions that I get and a lot of the bug reports that we receive in general are covered in the manual, but nobody reads the manual.
I also want to see more people just reading the descriptions of cops and seeing what’s configurable, because I very often see that cops which are very configurable are just disabled because people simply don’t understand them. That is really problematic: use this style or another style, but don’t tell me that you don’t care about the consistency of something.
If people put some description when they disable some cops in their personal or company .rubocop.yml files, it would be really helpful, because I and the other maintainers sometimes just search GitHub for those to get some insight about what to change. But if there are no descriptions, we won’t know whether they disabled something because it was buggy and they did not report the bug, or because they didn’t understand something and needed some help.
I think that although our documentation is far from perfect, it’s not that bad. And if people invested a little bit of time, it would be very helpful.
And the other thing I can suggest to more people is, if your project is not very big, just jump straight in. Fix all the offenses at once, and don’t invest in some complex setup that will just show you offenses on this or that; something like Sider can help you in that case. But I think that for the team itself, it’s better to just clean up the code base at once. I think that any code base under 100,000 lines can be fixed in a day or two if people focus on it. If you’re as unfortunate as us at Toptal, with a lot of Ruby projects and a lot of legacy code, it takes a little bit more time. But even with that, we eventually cleaned up everything. You cannot be carrying around different code styles forever. At some point, you have to draw the line and say, “I’m done.” That would be my final advice for them.
Thank you very much. As a final question, we also want to ask for another comment specifically for Japanese users.
Well, I love the support that I have been getting from the Japanese users. They have been some of the most active committers and I want to see more of them more often in the organization, the Japanese project and such. Japan is the heart of the Ruby community and it can be the heart of the RuboCop community as well.
People should know that they are very welcome to help us in any way. I know that many people here in Japan seem to be a bit shy, but we are a very friendly bunch.
As Genadi mentioned, Bulgarians are living in this Slav paradise, we are happy and welcoming people there.
Thank you very much for your warm comment again. Thank you very much.
It was a pleasure, thank you.
After the interview, we had dinner with Koichi Ito, another RuboCop committer, and some RubyKaigi speakers. Talking about RuboCop over great sake made for a wonderful night!
Thank you, Bozhidar. Sider is looking forward to seeing you in Japan again!
For more information about Sider, please go to our website.