Scientist
:microscope: A Ruby library for carefully refactoring critical paths.
How do I science?
Let's pretend you're changing the way you handle permissions in a large web app. Tests can help guide your refactoring, but you really want to compare the current and refactored behaviors under load.
require "scientist"

class MyWidget
  def allows?(user)
    experiment = Scientist::Default.new "widget-permissions"
    experiment.use { model.check_user(user).valid? } # old way
    experiment.try { user.can?(:read, model) } # new way

    experiment.run
  end
end
Wrap a use block around the code's original behavior, and wrap try around the new behavior. experiment.run will always return whatever the use block returns, but it does a bunch of stuff behind the scenes:
- It decides whether or not to run the try block,
- Randomizes the order in which use and try blocks are run,
- Measures the wall time and cpu time of all behaviors in seconds,
- Compares the result of try to the result of use,
- Swallows and records exceptions raised in the try block when overriding raised, and
- Publishes all this information.
The use block is called the control. The try block is called the candidate.
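The steps listed above can be sketched in plain Ruby. This is a toy model written for illustration, not Scientist's actual implementation: TinyExperiment, its Observation struct, and the published array are hypothetical names. It always runs the candidate when one is defined and records only wall-clock time.

```ruby
# Toy model of an experiment run. TinyExperiment is a hypothetical class,
# NOT part of the scientist gem; it only mirrors the steps listed above.
class TinyExperiment
  Observation = Struct.new(:name, :value, :exception, :duration)

  attr_reader :published

  def initialize(name)
    @name = name
    @behaviors = {}
    @published = []
  end

  def use(&block)
    @behaviors[:control] = block
  end

  def try(&block)
    @behaviors[:candidate] = block
  end

  def run
    control = @behaviors.fetch(:control)
    # With no candidate there is nothing to compare: just run the control.
    return control.call unless @behaviors.key?(:candidate)

    # Run both behaviors in random order, timing each and capturing exceptions.
    observations = @behaviors.keys.shuffle.map { |key| [key, observe(key, @behaviors[key])] }.to_h

    ctrl = observations[:control]
    cand = observations[:candidate]
    @published << { name: @name, matched: equivalent?(ctrl, cand), observations: observations }

    # The control's outcome is always what the caller sees.
    raise ctrl.exception if ctrl.exception
    ctrl.value
  end

  private

  # Calls the block, recording its value or exception and its wall-clock duration.
  def observe(name, block)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    value = block.call
    Observation.new(name, value, nil, elapsed_since(started))
  rescue StandardError => e
    Observation.new(name, nil, e, elapsed_since(started))
  end

  def elapsed_since(started)
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end

  # Values are compared with ==; two errors match on class and message.
  def equivalent?(a, b)
    if a.exception || b.exception
      !a.exception.nil? && !b.exception.nil? &&
        a.exception.class == b.exception.class &&
        a.exception.message == b.exception.message
    else
      a.value == b.value
    end
  end
end

experiment = TinyExperiment.new("widget-permissions")
experiment.use { true }                       # control
experiment.try { raise "new code is broken" } # candidate error is swallowed
puts experiment.run                           # true
puts experiment.published.last[:matched]      # false
```

The caller only ever sees the control's result; the candidate's failure shows up solely in the published record.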
Creating an experiment is wordy, but when you include the Scientist module, the science helper will instantiate an experiment and call run for you:
require "scientist"

class MyWidget
  include Scientist

  def allows?(user)
    science "widget-permissions" do |experiment|
      experiment.use { model.check_user(user).valid? } # old way
      experiment.try { user.can?(:read, model) } # new way
    end # returns the control value
  end
end
If you don't declare any try blocks, none of the Scientist machinery is invoked and the control value is always returned.
Making science useful
The examples above will run, but they're not really doing anything. The try blocks don't run yet and none of the results get published. Replace the default experiment implementation to control execution and reporting:
require "scientist/experiment"

class MyExperiment
  include Scientist::Experiment

  attr_accessor :name

  def initialize(name)
    @name = name
  end

  def enabled?
    # see "Ramping up experiments" below
    true
  end

  def raised(operation, error)
    # see "In a Scientist callback" below
    p "Operation '#{operation}' failed with error '#{error.inspect}'"
    super # will re-raise
  end

  def publish(result)
    # see "Publishing results" below
    p result
  end
end
When Scientist::Experiment is included in a class, it automatically sets it as the default implementation via Scientist::Experiment.set_default. This set_default call is skipped if you include Scientist::Experiment in a module.
Now calls to the science helper will load instances of MyExperiment.
Controlling comparison
Scientist compares control and candidate values using ==. To override this behavior, use compare to define how to compare observed values instead:
class MyWidget
  include Scientist

  def users
    science "users" do |e|
      e.use { User.all } # returns User instances
      e.try { UserService.list } # returns UserService::User instances

      e.compare do |control, candidate|
        control.map(&:login) == candidate.map(&:login)
      end
    end
  end
end
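The comparison block is ordinary Ruby, so its logic can be checked in isolation. A small sketch, using hypothetical LegacyUser and ServiceUser structs as stand-ins for User and UserService::User (SAME_LOGINS is likewise an illustrative name):

```ruby
# Hypothetical stand-ins for the two user classes; only #login matters here.
LegacyUser  = Struct.new(:login, :id)
ServiceUser = Struct.new(:login)

# Same logic as the compare block above, extracted so it can be called directly.
SAME_LOGINS = ->(control, candidate) do
  control.map(&:login) == candidate.map(&:login)
end

control   = [LegacyUser.new("alice", 1), LegacyUser.new("bob", 2)]
candidate = [ServiceUser.new("alice"), ServiceUser.new("bob")]

puts control == candidate                 # false: different classes never compare equal
puts SAME_LOGINS.call(control, candidate) # true: the logins line up
```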
If either the control block or candidate block raises an error, Scientist compares the two observations' classes and messages using ==. To override this behavior, use compare_errors to define how to compare observed errors instead:
class MyWidget
  include Scientist

  def slug_from_login(login)
    science "slug_from_login" do |e|
      e.use { User.slug_from_login login } # returns String instance or ArgumentError
      e.try { UserService.slug_from_login login } # returns String instance or ArgumentError

      compare_error_message_and_class = -> (control, candidate) do
        control.class == candidate.class &&
          control.message == candidate.message
      end

      compare_argument_errors = -> (control, candidate) do
        control.class == ArgumentError &&
          candidate.class == ArgumentError &&
          control.message.start_with?("Input has invalid characters") &&
          candidate.message.start_with?("Invalid characters in input")
      end

      e.compare_errors do |control, candidate|
        compare_error_message_and_class.call(control, candidate) ||
          compare_argument_errors.call(control, candidate)
      end
    end
  end
end
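To see what the two comparators above accept and reject, they can be run against hand-built exceptions outside of any experiment. In this sketch the constant names and the errors_equivalent? helper are hypothetical, and the error messages are made up for illustration:

```ruby
# The two comparators from the example above, lifted into constants so they
# can be exercised without running an experiment.
SAME_CLASS_AND_MESSAGE = ->(control, candidate) do
  control.class == candidate.class && control.message == candidate.message
end

KNOWN_ARGUMENT_ERRORS = ->(control, candidate) do
  control.class == ArgumentError &&
    candidate.class == ArgumentError &&
    control.message.start_with?("Input has invalid characters") &&
    candidate.message.start_with?("Invalid characters in input")
end

# Mirrors the body of the compare_errors block: either comparator may accept.
def errors_equivalent?(control, candidate)
  SAME_CLASS_AND_MESSAGE.call(control, candidate) ||
    KNOWN_ARGUMENT_ERRORS.call(control, candidate)
end

old_error = ArgumentError.new("Input has invalid characters: !")
new_error = ArgumentError.new("Invalid characters in input: !")

puts errors_equivalent?(old_error, new_error)                 # true: known rewording is accepted
puts errors_equivalent?(old_error, TypeError.new("bad type")) # false: different class is a mismatch
```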
Adding context
Results aren't very useful without some way to identify them. Use the context method to add to or retrieve the context for an experiment:
science "widget-permissions" do |e|
  e.context :user => user

  e.use { model.check_user(user).valid? }
  e.try { user.can?(:read, model) }
end
context takes a Symbol-keyed Hash of extra data. The data is available in Experiment#publish via the context method. If you're using the science helper a lot in a class, you can provide a default context:
class MyWidget
  include Scientist

  def allows?(user)
    science "widget-permissions" do |e|
      e.context :user => user

      e.use { model.check_user(user).valid? }
      e.try { user.can?(:read, model) }
    end
  end

  def destroy
    science "widget-destruction" do |e|
      e.use { old_scary_destroy }
      e.try { new_safe_destroy }
    end
  end

  def default_scientist_context
    { :widget => self }
  end
end
The widget-permissions and widget-destruction experiments will both have a :widget key in their contexts.
Expensive setup
If an experiment requires expensive setup that should only occur when the experiment is going to be run, define it with the before_run method:
# Code under test modifies this in-place. We want to copy it for the
# candidate code, but only when needed:
value_for_original_code = big_object
value_for_new_code = nil

science "expensive-but-worthwhile" do |e|
  e.before_run do
    value_for_new_code = big_object.deep_copy
  end
  e.use { original_code(value_for_original_code) }
  e.try { new_code(value_for_new_code) }
end
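The point of before_run is that the setup cost is paid only on the path where the candidate actually runs. A toy model of that behavior, assuming a hypothetical SetupAwareExperiment class and run_with_setup helper (neither is part of the gem), with a counter standing in for the expensive deep copy:

```ruby
# Toy model (hypothetical class, not the scientist gem): the before_run block
# is invoked only when the candidate is actually going to run.
class SetupAwareExperiment
  def initialize(enabled:)
    @enabled = enabled
    @before_run = nil
    @control = nil
    @candidate = nil
  end

  def before_run(&block)
    @before_run = block
  end

  def use(&block)
    @control = block
  end

  def try(&block)
    @candidate = block
  end

  def run
    # Disabled or no candidate: skip the setup and return the control value.
    return @control.call unless @enabled && @candidate

    @before_run&.call
    @candidate.call # a real run would compare and publish this result
    @control.call
  end
end

# Returns how many times the (simulated) expensive setup ran.
def run_with_setup(enabled)
  copies_made = 0
  experiment = SetupAwareExperiment.new(enabled: enabled)
  experiment.before_run { copies_made += 1 } # stands in for big_object.deep_copy
  experiment.use { :control }
  experiment.try { :candidate }
  experiment.run
  copies_made
end

puts run_with_setup(true)  # 1
puts run_with_setup(false) # 0
```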
Keeping it clean
Sometimes you don't want to store the full value for later analysis. For example, an experiment may return User instances, but when researching a mismatch, all you care about is the logins. You can define how to clean these values in an experiment:
class MyWidget
  include Scientist

  def users
    science "users" do |e|
      e.use { User.all }
      e.try { UserService.list }

      e.clean do |value|
        value.map(&:login).sort
      end
    end
  end
end
And this cleaned value is available in observations in the final published result:
class MyExperiment
  include Scientist::Experiment

  # ...

  def publish(result)
    result.control.value         # [<User alice>, <User bob>, <User carol>]
    result.control.cleaned_value # ["alice", "bob", "carol"]
  end
end
Note that the #clean method will discard the previous cleaner block if you call it again. If for some reason you need to access the currently configured cleaner block, Scientist::Experiment#cleaner will return the block without further ado. (This probably won't come up in normal usage, but comes in handy if you're writing, say, a custom experiment runner that provides default cleaners.)
The #clean method is not used when comparing results. In the following example, removing the #compare block would make the experiment report a mismatch, because the raw values differ in order:
def user_ids
  science "user_ids" do |e|
    e.use { [1,2,3] }
    e.try { [1,3,2] }
    e.clean { |value| value.sort }
    e.compare { |a, b| a.sort == b.sort }
  end
end
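The split between cleaning and comparing can be checked with plain values. In this sketch the constant names are hypothetical; the lambdas carry the same logic as the clean and compare blocks above:

```ruby
CONTROL_IDS   = [1, 2, 3].freeze
CANDIDATE_IDS = [1, 3, 2].freeze

CLEANER    = ->(value) { value.sort }
COMPARATOR = ->(a, b) { a.sort == b.sort }

# Default comparison uses == on the raw values, so ordering differences count.
puts CONTROL_IDS == CANDIDATE_IDS                # false
# The custom comparator ignores ordering.
puts COMPARATOR.call(CONTROL_IDS, CANDIDATE_IDS) # true
# Cleaning only changes what would be stored for later analysis.
p CLEANER.call(CANDIDATE_IDS)                    # [1, 2, 3]
```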
Ignoring mismatches
During the early stages of an experiment, it's possible that some of your code will always generate a mismatch for reasons you know and understand but haven't yet fixed. Instead of these known cases always showing up as mismatches in your metrics or analysis, you can tell an experiment whether or not to ignore a mismatch using the ignore method. You may include more than one block if needed:
def admin?(user)
  science "widget-permissions" do |e|
    e.use { model.check_user(user).admin? }
    e.try { user.can?(:admin, model) }

    e.ignore { user.staff? } # user is staff, always an admin in the new system
    e.ignore do |control, candidate|
      # new system doesn't handle unconfirmed users yet:
      control && !candidate && !user.confirmed_email?
    end
  end
end
The ignore blocks are only called if the values don't match. Unless a compare_errors comparator is defined, two cases are considered mismatches: a) one observation raising an exception and the other not, b) observations raising exceptions with different classes or messages.
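One way to picture how several ignore blocks combine is the sketch below. It assumes a mismatch is ignored as soon as any block returns a truthy value, which is a reading of the text above rather than a statement about the gem's internals. HypotheticalUser, ignore_blocks_for, and mismatch_ignored? are illustrative names.

```ruby
# Hypothetical user with the two attributes the ignore blocks above look at.
HypotheticalUser = Struct.new(:staff, :confirmed_email)

# The same two rules as the ignore blocks above, as plain procs.
def ignore_blocks_for(user)
  [
    proc { user.staff },
    proc { |control, candidate| control && !candidate && !user.confirmed_email }
  ]
end

# Assumption: blocks are consulted only on a mismatch, and any truthy
# result is enough to ignore it.
def mismatch_ignored?(ignore_blocks, control, candidate)
  return false if control == candidate
  ignore_blocks.any? { |block| block.call(control, candidate) }
end

staff       = HypotheticalUser.new(true, true)
unconfirmed = HypotheticalUser.new(false, false)
regular     = HypotheticalUser.new(false, true)

puts mismatch_ignored?(ignore_blocks_for(staff), true, false)       # true
puts mismatch_ignored?(ignore_blocks_for(unconfirmed), true, false) # true
puts mismatch_ignored?(ignore_blocks_for(regular), true, false)     # false
```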
Enabling/disabling experiments
Sometimes you don't want an experiment to run, for example when a new codepath should stay disabled for anyone who isn't staff. You can disable an experiment by setting a run_if block. If this returns false, the experiment will merely return the control value. Otherwise, it defers to the experiment's configured enabled? method.
class DashboardController
  include Scientist

  def dashboard_items
    science "dashboard-items" do |e|
      # only run this experiment for staff members
      e.run_if { current_user.staff? }
      # ...
    end
  end
end
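The order of checks described above (a try block must exist, run_if can veto, then enabled? decides) can be written as a small decision function. This is a hedged sketch: run_candidate? and its keyword arguments are hypothetical and do not correspond to a method in the gem.

```ruby
# Toy decision function (hypothetical, not the gem's code): should the
# candidate run for this call?
def run_candidate?(has_candidate:, enabled:, run_if: nil)
  return false unless has_candidate      # no try block: control only
  return false if run_if && !run_if.call # run_if returned false: control only
  enabled                                # otherwise defer to enabled?
end

puts run_candidate?(has_candidate: true, enabled: true, run_if: -> { false }) # false
puts run_candidate?(has_candidate: true, enabled: true, run_if: -> { true })  # true
puts run_candidate?(has_candidate: true, enabled: false, run_if: -> { true }) # false
```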
Ramping up experiments
As a scientist, you know it's always important to be able to turn your experiment off, lest it run amok and result in villagers with pitchforks on your doorstep. In order to control whether or not an experiment is enabled, you must include the enabled? method in your `Scientist
