Looking through our exception tracker the other day, I ran across
a notice from our slow-query logger that caught my eye. I saw a
SELECT … WHERE … LIKE
query with lots of percent
signs in the LIKE
clause. It was pretty obvious that
this term was user-provided and my first thought was SQL
injection.
[3.92 sec] SELECT ... WHERE (profiles.email LIKE '%64%68%6f%6d%65%73@%67%6d%61%69%6c.%63%6f%6d%') LIMIT 10
Looking at the code, it turned out that we were using a
user-provided term directly in the LIKE
clause
without any checks for metacharacters that are interpreted in
this context (%
, _
, \
).
def self.search(term, options = {})
limit = (options[:limit] || 30).to_i
friends = options[:friends] || []
with_orgs = options[:with_orgs].nil? ? false : options[:with_orgs]
if term.to_s.index("@")
users = User.includes(:profile)
.where("profiles.email LIKE ?", "#{term}%")
.limit(limit).to_a
else
users = user_query(term, friends: friends, limit: limit)
end
...
end
While this isn’t full-blown SQL injection, it got me thinking about the impact of this kind of injection. This kind of pathological query clearly has some performance impact because we logged a slow query. The question is how much?
I asked our database experts and was told that it depends on
where the wildcard is in the query. With a %
in the
middle of a query, the database can still check the index for the
beginning characters of the term. With a %
at the
start of the query, indices may not get used at all. This bit of
insight led me to run several queries with varied %
placement against a test database.
mysql> SELECT 1 FROM `profiles` WHERE `email` LIKE "chris@github.com";
1 row in set (0.00 sec)
mysql> SELECT 1 FROM `profiles` WHERE `email` LIKE "%ris@github.com";
1 row in set (0.91 sec)
mysql> SELECT 1 FROM `profiles` WHERE `email` LIKE "chris@github%";
1 row in set (0.00 sec)
mysql> SELECT 1 FROM `profiles` WHERE `email` LIKE "%c%h%r%i%s%@%g%i%t%h%u%b%.%c%o%m%";
21 rows in set (0.93 sec)
It seems that unsanitized user-provided LIKE
clauses
do have a potential performance impact, but how do we address
this in a Ruby on Rails application? Searching the web, I
couldn’t find any great suggestions. There are no Rails helpers
for escaping LIKE
metacharacters, so we wrote some.
module LikeQuery
# Characters that have special meaning inside the `LIKE` clause of a query.
#
# `%` is a wildcard representing multiple characters.
# `_` is a wildcard representing one character.
# `\` is used to escape other metacharacters.
LIKE_METACHARACTER_REGEX = /([\\%_])/
# What to replace `LIKE` metacharacters with. We want to prepend a literal
# backslash to each metacharacter. Because String#gsub does its own round of
# interpolation on its second argument, we have to double escape backslashes
# in this String.
LIKE_METACHARACTER_ESCAPE = '\\\\\1'
# Public: Escape characters that have special meaning within the `LIKE` clause
# of a SQL query.
#
# value - The String value to be escaped.
#
# Returns a String.
def like_sanitize(value)
raise ArgumentError unless value.respond_to?(:gsub)
value.gsub(LIKE_METACHARACTER_REGEX, LIKE_METACHARACTER_ESCAPE)
end
extend self
module ActiveRecordHelper
# Query for values with the specified prefix.
#
# column - The column to query.
# prefix - The value prefix to query for.
#
# Returns an ActiveRecord::Relation
def with_prefix(column, prefix)
where("#{column} LIKE ?", "#{LikeQuery.like_sanitize(prefix)}%")
end
# Query for values with the specified suffix.
#
# column - The column to query.
# suffix - The value suffix to query for.
#
# Returns an ActiveRecord::Relation
def with_suffix(column, suffix)
where("#{column} LIKE ?", "%#{LikeQuery.like_sanitize(suffix)}")
end
# Query for values with the specified substring.
#
# column - The column to query.
# substring - The value substring to query for.
#
# Returns an ActiveRecord::Relation
def with_substring(column, substring)
where("#{column} LIKE ?", "%#{LikeQuery.like_sanitize(substring)}%")
end
end
end
ActiveRecord::Base.extend LikeQuery
ActiveRecord::Base.extend LikeQuery::ActiveRecordHelper
We then went through and audited all of our LIKE
queries, fixing eleven such cases. The risk of these queries
turned out to be relatively low. A user could subvert the
intention of the query, though not in any meaningful way. For us,
this was simply a Denial of Service (DoS) vector. It’s nothing
revolutionary and it is not a new vulnerability class,
but it’s something to keep an eye out for. Three second queries
can be a significant performance hit and application-level DoS
vulnerabilities need to be mitigated.
Update: A number of people pointed out that in
Rails 4.2, ActiveRecord includes a sanitize_sql_like
helper for
sanitizing LIKE clauses.