Add SASLprep. Code generated & tested with RFC3454
RFC-4422 recommends that mechanisms SHOULD prepare simple usernames and
passwords with SASLprep.  And SASLprep is required by the `SCRAM-*`
mechanisms, which will be added in a future PR.

SASLprep is also recommended for the `PLAIN` SASL mechanism and for the
`ACL` IMAP extension but—in both cases—string preparation is done by the
*server*, at its own discretion.

SASLprep has been officially obsoleted by PRECIS.  I don't believe any
IMAP RFCs have allowed replacing SASLprep with PRECIS yet.  In contrast,
RFC-7622 updates XMPP to require PRECIS.  (See RFC-8265 for more info on
PRECIS, and Section 6 for migration considerations.)

Rather than create a fully generic StringPrep superclass or function,
the SASLprep profile is optimized.  Just enough of the generic
StringPrep algorithm has been implemented to provide more detailed
errors for prohibited strings.  Future PRs can expand it as needed, to
implement other profiles.  In particular, the `trace` StringPrep profile
is a requirement for clients using the `ANONYMOUS` mechanism.

Many other StringPrep implementations store the tables as an array of
ranges and loop over every character in the input.  But using Regexp is
simpler and much faster, especially where the tables closely match
Unicode character classes (benchmarks are included).  Some StringPrep
tables use Regexps that are generated from the RFC-3454 appendices.
Manually written regular expressions are used in cases where there is a
close match between Unicode character classes and the SASLprep tables.
All regexps are tested against the RFC tables with every valid
codepoint, to verify they aren't broken if their character classes are
changed by new versions of Unicode.
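
As a rough illustration of the two approaches and of the exhaustive check,
here is a minimal sketch using the small RFC-3454 table C.2.1 (ASCII control
characters).  The constant and method names are made up for this example and
are not the ones used in this commit:

```ruby
require "set"

# RFC-3454 table C.2.1 (ASCII control characters), stored as codepoint ranges.
C_2_1_RANGES = [0x0000..0x001f, 0x007f..0x007f].freeze

# Approach 1: loop over every character and test it against every range.
def prohibited_by_ranges?(string)
  string.each_codepoint.any? { |cp| C_2_1_RANGES.any? { |range| range.cover?(cp) } }
end

# Approach 2: one Regexp character class covering the same codepoints.
C_2_1_REGEXP = /[\x00-\x1f\x7f]/.freeze

def prohibited_by_regexp?(string)
  C_2_1_REGEXP.match?(string)
end

# Every codepoint that can be encoded as UTF-8 (surrogates raise RangeError).
ALL_CODEPOINTS = (0..0x10ffff).filter_map { |cp| cp.chr("UTF-8") rescue nil }

# Exhaustive check: the Regexp and the range table must agree everywhere.
TABLE_C_2_1 = C_2_1_RANGES.flat_map(&:to_a).to_set
MISMATCHES = ALL_CODEPOINTS.reject { |char|
  C_2_1_REGEXP.match?(char) == TABLE_C_2_1.include?(char.ord)
}
puts MISMATCHES.empty? ? "regexp agrees with table C.2.1" : "mismatches found"
```

The same kind of loop over all codepoints is what would catch a regexp built
on a Unicode character class drifting away from the RFC table.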

Additionally:

* Added `rake rfcs` to download many IMAP-related RFCs, for convenience.
* The new code is namespaced under `Net::IMAP::SASL`.  We could move the
  authenticators there too.  If SASL functionality is ever extracted to
  another gem, we can use: `Net::IMAP::SASL = Net::SASL` for backward
  compatibility.
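
A minimal sketch of that alias, assuming a hypothetical extracted `Net::SASL`
module.  Everything is wrapped in an `AliasSketch` namespace so it cannot
clash with the real `Net::IMAP`, and the method body is only a placeholder:

```ruby
module AliasSketch
  module Net
    # Hypothetical stand-in for SASL code living in its own gem.
    module SASL
      def self.saslprep(string)
        string.unicode_normalize(:nfkc) # placeholder, not the real profile
      end
    end

    class IMAP
      # The old constant path keeps working because it is the same object.
      SASL = Net::SASL
    end
  end
end

puts AliasSketch::Net::IMAP::SASL.equal?(AliasSketch::Net::SASL)
```

Because a constant assignment only adds a second name for the same module,
callers of the old path would see the same methods and error classes.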
nevans committed Sep 30, 2022
1 parent 174e35c commit 2903b9c
Showing 15 changed files with 1,254 additions and 0 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -4,6 +4,7 @@
/coverage/
/doc/
/pkg/
/rfcs
/spec/reports/
/tmp/
/Gemfile.lock
3 changes: 3 additions & 0 deletions Rakefile
@@ -1,5 +1,8 @@
# frozen_string_literal: true

require "bundler/gem_tasks"
require "rake/testtask"
require "rake/clean"

Rake::TestTask.new(:test) do |t|
  t.libs << "test/lib"
65 changes: 65 additions & 0 deletions benchmarks/stringprep.yml
@@ -0,0 +1,65 @@
---
prelude: |
  begin
    require "mongo" # gem install mongo
    require "idn" # gem install idn-ruby
  rescue LoadError
    warn "You must 'gem install mongo idn-ruby' for this benchmark."
    raise
  end
  MStrPrep = Mongo::Auth::StringPrep
  # this indirection will slow it down a little bit
  def mongo_saslprep(string)
    MStrPrep.prepare(string,
                     MStrPrep::Profiles::SASL::MAPPINGS,
                     MStrPrep::Profiles::SASL::PROHIBITED,
                     normalize: true,
                     bidi: true)
  rescue Mongo::Error::FailedStringPrepValidation
    nil
  end
  $LOAD_PATH.unshift "./lib"
  require "net/imap"
  def net_imap_saslprep(string)
    Net::IMAP::SASL::SASLprep.saslprep string, exception: false
  end
  def libidn_saslprep(string)
    IDN::Stringprep.with_profile(string, "SASLprep")
  rescue IDN::Stringprep::StringprepError
    nil
  end
benchmark:
  - net_imap_saslprep "I\u00ADX" # RFC example 1. IX
  - net_imap_saslprep "user" # RFC example 2. user
  - net_imap_saslprep "USER" # RFC example 3. user
  - net_imap_saslprep "\u00aa" # RFC example 4. a
  - net_imap_saslprep "\u2168" # RFC example 5. IX
  - net_imap_saslprep "\u0007" # RFC example 6. Error - prohibited character
  - net_imap_saslprep "\u0627\u0031" # RFC example 7. Error - bidirectional check
  - net_imap_saslprep "I\u2000X" # map to space: I X
  - net_imap_saslprep "a longer string, e.g. a password"

  - libidn_saslprep "I\u00ADX" # RFC example 1. IX
  - libidn_saslprep "user" # RFC example 2. user
  - libidn_saslprep "USER" # RFC example 3. user
  - libidn_saslprep "\u00aa" # RFC example 4. a
  - libidn_saslprep "\u2168" # RFC example 5. IX
  - libidn_saslprep "\u0007" # RFC example 6. Error - prohibited character
  - libidn_saslprep "\u0627\u0031" # RFC example 7. Error - bidirectional check
  - libidn_saslprep "I\u2000X" # map to space: I X
  - libidn_saslprep "a longer string, e.g. a password"

  - mongo_saslprep "I\u00ADX" # RFC example 1. IX
  - mongo_saslprep "user" # RFC example 2. user
  - mongo_saslprep "USER" # RFC example 3. user
  - mongo_saslprep "\u00aa" # RFC example 4. a
  - mongo_saslprep "\u2168" # RFC example 5. IX
  - mongo_saslprep "\u0007" # RFC example 6. Error - prohibited character
  - mongo_saslprep "\u0627\u0031" # RFC example 7. Error - bidirectional check
  - mongo_saslprep "I\u2000X" # map to space: I X
  - mongo_saslprep "a longer string, e.g. a password"
39 changes: 39 additions & 0 deletions benchmarks/table-regexps.yml
@@ -0,0 +1,39 @@
prelude: |
  require "json"
  require "set"
  all_codepoints = (0..0x10ffff).map{_1.chr("UTF-8") rescue nil}.compact
  rfc3454_tables = Dir["rfcs/rfc3454*.json"]
    .first
    .then{File.read _1}
    .then{JSON.parse _1}
  titles = rfc3454_tables.delete("titles")
  sets = rfc3454_tables
    .transform_values{|t|t.keys rescue t}
    .transform_values{|table|
      table
        .map{_1.split(?-).map{|i|Integer i, 16}}
        .flat_map{_2 ? (_1.._2).to_a : _1}
        .to_set
    }
  TABLE_A1_SET = sets.fetch "A.1"
  ASSIGNED_3_2 = /\p{AGE=3.2}/
  UNASSIGNED_3_2 = /\P{AGE=3.2}/
  TABLE_A1_REGEX = /(?-mix:[\u{0000}-\u{001f}\u{007f}-\u{00a0}\u{0340}-\u{0341}\u{06dd}\u{070f}\u{1680}\u{180e}\u{2000}-\u{200f}\u{2028}-\u{202f}\u{205f}-\u{2063}\u{206a}-\u{206f}\u{2ff0}-\u{2ffb}\u{3000}\u{e000}-\u{f8ff}\u{fdd0}-\u{fdef}\u{feff}\u{fff9}-\u{ffff}\u{1d173}-\u{1d17a}\u{1fffe}-\u{1ffff}\u{2fffe}-\u{2ffff}\u{3fffe}-\u{3ffff}\u{4fffe}-\u{4ffff}\u{5fffe}-\u{5ffff}\u{6fffe}-\u{6ffff}\u{7fffe}-\u{7ffff}\u{8fffe}-\u{8ffff}\u{9fffe}-\u{9ffff}\u{afffe}-\u{affff}\u{bfffe}-\u{bffff}\u{cfffe}-\u{cffff}\u{dfffe}-\u{dffff}\u{e0001}\u{e0020}-\u{e007f}\u{efffe}-\u{10ffff}])|(?-mix:\p{Cs})/.freeze
benchmark:

  # matches A.1
  - script: "all_codepoints.grep(TABLE_A1_SET)"
  - script: "all_codepoints.grep(TABLE_A1_REGEX)"
  - script: "all_codepoints.grep(UNASSIGNED_3_2)"
  - script: "all_codepoints.grep_v(ASSIGNED_3_2)"

  # doesn't match A.1
  - script: "all_codepoints.grep_v(TABLE_A1_SET)"
  - script: "all_codepoints.grep_v(TABLE_A1_REGEX)"
  - script: "all_codepoints.grep_v(UNASSIGNED_3_2)"
  - script: "all_codepoints.grep(ASSIGNED_3_2)"
1 change: 1 addition & 0 deletions lib/net/imap.rb
@@ -1474,3 +1474,4 @@ def start_tls_session(params = {})
require_relative "imap/response_data"
require_relative "imap/response_parser"
require_relative "imap/authenticators"
require_relative "imap/sasl"
74 changes: 74 additions & 0 deletions lib/net/imap/sasl.rb
@@ -0,0 +1,74 @@
# frozen_string_literal: true

module Net
  class IMAP

    # Pluggable authentication mechanisms for protocols which support SASL
    # (Simple Authentication and Security Layer), such as IMAP4, SMTP, LDAP, and
    # XMPP.  {RFC-4422}[https://tools.ietf.org/html/rfc4422] specifies the
    # common SASL framework and the +EXTERNAL+ mechanism, and the
    # {SASL mechanism registry}[https://www.iana.org/assignments/sasl-mechanisms/sasl-mechanisms.xhtml]
    # lists the specification for others.
    #
    # "SASL is conceptually a framework that provides an abstraction layer
    # between protocols and mechanisms as illustrated in the following diagram."
    #
    #               SMTP    LDAP    XMPP   Other protocols ...
    #                  \       |    |      /
    #                   \      |    |     /
    #                  SASL abstraction layer
    #                   /      |    |     \
    #                  /       |    |      \
    #           EXTERNAL   GSSAPI  PLAIN   Other mechanisms ...
    #
    module SASL

      # autoloading to avoid loading all of the regexps when they aren't used.

      autoload :StringPrep, File.expand_path("sasl/stringprep", __dir__)
      autoload :SASLprep, File.expand_path("sasl/saslprep", __dir__)

      # ArgumentError raised when +string+ is invalid for the stringprep
      # +profile+.
      class StringPrepError < ArgumentError
        attr_reader :string, :profile

        def initialize(*args, string: nil, profile: nil)
          @string = -string.to_str unless string.nil?
          @profile = -profile.to_str unless profile.nil?
          super(*args)
        end
      end

      # StringPrepError raised when +string+ contains a codepoint prohibited by
      # +table+.
      class ProhibitedCodepoint < StringPrepError
        attr_reader :table

        def initialize(table, *args, **kwargs)
          @table = -table.to_str
          details = (title = StringPrep::TABLE_TITLES[table]) ?
            "%s [%s]" % [title, table] : table
          message = "String contains a prohibited codepoint: %s" % [details]
          super(message, *args, **kwargs)
        end
      end

      # StringPrepError raised when +string+ contains bidirectional characters
      # which violate the StringPrep requirements.
      class BidiStringError < StringPrepError
      end

      module_function

      # See SASLprep#saslprep.
      def saslprep(string, **opts)
        SASLprep.saslprep(string, **opts)
      end

    end
  end

end

Net::IMAP.extend Net::IMAP::SASL
55 changes: 55 additions & 0 deletions lib/net/imap/sasl/saslprep.rb
@@ -0,0 +1,55 @@
# frozen_string_literal: true

require_relative "saslprep_tables"

module Net::IMAP::SASL

  # SASLprep#saslprep can be used to prepare a string according to [RFC4013].
  #
  # \SASLprep maps characters three ways: to nothing, to space, and to Unicode
  # normalization form KC.  \SASLprep prohibits codepoints from nearly all
  # standard StringPrep tables (RFC3454, Appendix "C"), and uses \StringPrep's
  # standard bidirectional characters requirements (Appendix "D").  \SASLprep
  # also uses \StringPrep's definition of "Unassigned" codepoints (Appendix "A").
  module SASLprep

    # Used to short-circuit strings that don't need preparation.
    ASCII_NO_CTRLS = /\A[\x20-\x7e]*\z/u.freeze

    module_function

    # Prepares a UTF-8 +string+ for comparison, using the \SASLprep profile
    # (RFC4013) of the StringPrep algorithm (RFC3454).
    #
    # By default, prohibited strings will return +nil+.  When +exception+ is
    # +true+, a StringPrepError describing the violation will be raised.
    #
    # When +stored+ is +true+, "unassigned" codepoints will be prohibited.  For
    # \StringPrep and the \SASLprep profile, "unassigned" refers to Unicode 3.2,
    # and not later versions.  See RFC3454 §7 for more information.
    #
    def saslprep(str, stored: false, exception: false)
      return str if ASCII_NO_CTRLS.match?(str) # raises on incompatible encoding
      str = str.encode("UTF-8") # also dups (and raises for invalid encoding)
      str.gsub!(MAP_TO_SPACE, " ")
      str.gsub!(MAP_TO_NOTHING, "")
      str.unicode_normalize!(:nfkc)
      # These regexps combine the prohibited and bidirectional checks
      return str unless str.match?(stored ? PROHIBITED_STORED : PROHIBITED)
      return nil unless exception
      # raise helpful errors to indicate *why* it failed:
      tables = stored ? TABLES_PROHIBITED_STORED : TABLES_PROHIBITED
      StringPrep.check_prohibited! str, *tables, bidi: true, profile: "SASLprep"
      # Fallback: the combined regexp matched but no specific table did.
      raise StringPrepError.new(
        "unknown error", string: str, profile: "SASLprep"
      )
    rescue ArgumentError, Encoding::CompatibilityError => ex
      if /invalid byte sequence|incompatible encoding/.match? ex.message
        return nil unless exception
        raise StringPrepError.new(ex.message, string: str, profile: "saslprep")
      end
      raise ex
    end

  end
end
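
The map, normalize, then prohibit sequence in `saslprep` can be tried out in
isolation with a minimal sketch like the one below.  It is not the code from
this commit: the three `SKETCH_*` tables are small, hand-picked subsets of the
RFC-3454 tables, the bidirectional and "unassigned" checks are left out, and
`sketch_saslprep` is a name invented for the example.

```ruby
# Illustrative subsets only; the real SASLprep tables are much larger.
SKETCH_MAP_TO_SPACE   = /[\u00a0\u2000-\u200a]/.freeze # part of RFC-3454 C.1.2
SKETCH_MAP_TO_NOTHING = /[\u00ad\u200c\u200d]/.freeze  # part of RFC-3454 B.1
SKETCH_PROHIBITED     = /[\x00-\x1f\x7f]/.freeze       # RFC-3454 C.2.1

# Returns the prepared string, or nil when a prohibited codepoint remains.
def sketch_saslprep(string)
  str = string.encode("UTF-8")
  str = str.gsub(SKETCH_MAP_TO_SPACE, " ")  # 1. map to space
  str = str.gsub(SKETCH_MAP_TO_NOTHING, "") # 2. map to nothing
  str = str.unicode_normalize(:nfkc)        # 3. normalize with NFKC
  SKETCH_PROHIBITED.match?(str) ? nil : str # 4. prohibited check
end

p sketch_saslprep("I\u00ADX") # => "IX"  (soft hyphen mapped to nothing)
p sketch_saslprep("\u00aa")   # => "a"   (NFKC normalization)
p sketch_saslprep("\u2168")   # => "IX"  (NFKC normalization)
p sketch_saslprep("I\u2000X") # => "I X" (mapped to space)
p sketch_saslprep("\u0007")   # => nil   (prohibited control character)
```

The inputs are the same strings used in `benchmarks/stringprep.yml`, minus the
bidirectional example, which this sketch does not attempt to handle.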