Sunday, June 23, 2013

ActiveRecord overdose

In the previous post, I highlighted the typical Rails bugs, that I encountered during my research on better ways of working on Rails apps. The bugs were often security-related:

The main common point of the bugs is the ActiveRecord overdose.

ActiveRecord overdose is a surprisingly popular pattern that includes:
  • putting everything into the model class
  • fear of adding new non-AR classes
  • over-relying on convention over configuration
When you look at a typical Rails model, it's a mix of the following responsibilities:
  • persistence 
  • business logic 
  • adapters 
  • view object 
  • form object 
  • others 

The Rails-way ActiveRecord is an enhanced version of the original Active Record pattern. The original pattern assumed only persistence and some simple business logic. The Rails version extended it with the view/form responsibilities. In many cases it's also extended with the adapters.
  • persistence 
    • associations 
    • queries - scopes 
    • saves 
    • simple validations 
  • business logic 
    • state-machine 
    • calculations 
    • tracking changes (touch) 
    • complex validations 
    • some callbacks 
    • feed/timeline/activity 
    • factories methods (class methods) 
  • adapters 
    • service (coordinating multiple adapters) 
    • search 
    • external api 
    • facebook 
    • twitter 
    • exception tracker 
    • metrics tracker 
    • image storage 
    • pdf 
    • mailer 
    • notification 
  • view object 
    • first_name + last_name 
    • foo.to_s 
    • image_path 
    • http boundary details (urls) 
    • i18n 
  • form object 
    • accepts_nested_attributes_for 
    • sanitizing 
    • attr_accessor 
  • others 
    • acts_as_foo (persistence, logic, validations) 
    • concerns 
    • includes 

The core responsibility of an ActiveRecord model is being a representation of a db record. That means handling the persistence and having the basic logic. This usually works great in typical CRUD apps. Once things get more complex, it's time to reconsider the active record pattern. 

The best way of escaping from the ActiveRecord overdose is to do it step by step. I'll focus on that in my later posts.

Monday, June 17, 2013

Typical Rails bugs

As part of my research on improving Rails application, I noticed a pattern in the bugs that are quite characteristic to Rails in several applications.

They have certain 'visible' things in common:

  • often security related, like leaking some information to unauthorized users
  • they live somewhere in the area of business logic/persistence
  • they are not so easy to fix in the existing codebase
  • they tend to exist in groups, similar bugs in different areas
  • they are easy to miss
  • often they appear during the requirements changes

Overdose of ActiveRecord

The reasons they exist is a combination of different things. The main common point of the bugs is the over-usage of ActiveRecord. It's easy to fall into this problem, I did it many times, as well.

You start with a new Rails app, adding new features very quickly. New requirements pop up. You use the popular Rails techniques to deal with them:

  • validations, which often turn into conditional validations
  • Single Table Inheritance
  • state-machine
  • accept_nested_attributes
  • models callbacks
  • virtual attributes
  • external gems that provide value magically, by using ActiveRecord (implicit dependencies)
Every technique is fine on its own. When you start using all of them in the same area, problems appear. 

Examples

I started noticing this trend, while reviewing the code of our potential Arkency customers. We sometimes do the "rescue missions" and help with legacy code. I can't use those examples here, so I started reviewing some of the popular open-source Rails applications: Discourse, Spree, Redmine, Gitlab. It didn't take me much time to discover the same bugs.

Just to be clear, all of the projects are awesome and I'm grateful to the people behind them. Most of the times, they deal with this kind of problems very quickly. I don't want to blame them for anything - they just serve here as examples.

Discourse

Let's start with Discourse:

1. Digest mail ignores secure groups



People receiving the digest mail can easily read posts not meant for them. That's because the digest mail ignores the secure groups a member has access to or not.
Quite a problem as I unfortunately found out.


2. Non-authenticated users see private topics in 404 page

http://meta.discourse.org/t/non-authenticated-users-see-private-topics-in-404-page-was-mobile-view/7419

Discourse correctly prevented me from seeing the topic and instead showed a 404 page. On that 404 page, the lists of "Latest Topics" and "Recent Topics" showed private topics. I could click those links, which resulted in seeing the same 404 page again.
The 404 page should not show private topics.


3. Auto-suggest topics shows private topics

http://meta.discourse.org/t/auto-suggest-topics-shows-private-topics/7418/3

We've got Discourse running with private categories. When a user without access to the private categories type a new topic, they are presented with topics in categories to which they don't have access.

Comment: Did you notice the pattern here?

Accidentally, I've had a small conversation at HN with one of the Discourse founders. He said:


Those private topic bugs are not the result of ActiveRecord. We added a group layer on top of existing code and missed some places where queries did not respect it.

Had we used raw SQL instead of an ORM we would have had the same issues. All projects are open to this style of bug. The correct thing to do is report, close them quickly and add tests to prevent them from happening again (which we do.)


I have completely no 'science' proof here, it's just based on my experience with hundreds of Rails projects. However, I disagree that all projects are open to this style of bug. 


It's very typical to Rails projects. 

The temptation of globally accessing every model/record from every place in the project makes it very easy to introduce such bugs. When you add new requirements (as they did) it's practically impossible to find all the places which should be changed.

Do I blame ActiveRecord here? No, ActiveRecord is a good library for persistence. It has its issues, it's huge, even the maintainers consider it a legacy code. Apart from some edge cases, it works more or less ok. I think the problem is relying on ActiveRecord for basically everything.

Spree

Let's see some other bugs, this time in Spree:


1. Fix wrong discount calculation with flat percent promotions



Fix wrong discount calculation with flat percent promotions when there are more than one line item in the order.

This error happened because the order instance here:
https://github.com/spree/spree/blob/master/core/app/models/spree/order_updater.rb#L24

is not always the same instance in memory here:
https://github.com/spree/spree/blob/master/core/app/models/spree/calculator/flat_percent_item_total.rb#L15

Adding the inverse option to the relationship makes sure you have the same object instance in both places.

Comment: Here we have another problem related to ActiveRecord. In more complex situations (cyclic associations), you can have the same 'record' in memory as two different instances, leading to errors.

2. Admin products loads entire dataset to determine the total count when paginating


When kaminary determines the total count of records it runs the following code
@collection.except(:offset, :limit, :order).count

The issue is that this sequence loads entire dataset to determine the count. This makes heavy load when products have large number of items.

The reason of this behaviour is group_by_products_id here https://github.com/spree/spree/blob/1-2-stable/core/app/controllers/spree/admin/products_controller.rb#L94

Comment: This time we have a combination of ActiveRecord and an external gem - kaminari (paginator). The end result is slowness of admin panels with thousands of products - they all will be loaded to memory. It's not the worst bug I've seen, but it's typical.

3. Default Tax Does Not Calculate Taxes Correctly if there is a Promotion


4. Taxes should be re-calculated after promotion adds an adjustment



Comment: the titles should be enough. This kind of bugs is usually related to the ActiveRecord callbacks usage. In complex scenarios it's not that easy to track all the places that need to be called after some changes.

Redmine

Let's now look at one Redmine bug:

1. Time entries of private issues are visible by users without permission to see them

Comment: Does it ring a bell? Again, a leak of private information.


Summary

We have seen some of the bugs, that I called typical Rails bugs. They seem to have certain things in common. I've seen them so many times, that they are the first things I look for, when I get access to a new project ticketing system. 

Is there any way of defending against them? I think so. The best step to start with is to be aware of such problems. Try to spot them in your projects, see what history is behind the bug, spread the knowledge in your team.

In my next blog posts, I'll focus on techniques that should help decrease the number of such bugs. Stay tuned. In the meantime, you may want to follow me on Twitter and subscribe to the Rails Architectures Guide, that I'm working on.