Often, Less is Better!

When I was learning Ruby, I was searching for things that could be improved. I was looking for ways I could make things better. One of the areas I was involved in was testing. This should come as no surprise. If you are writing Ruby code, you need to be writing tests. The exact order in which this is done is not my point. Just that tests must be written and run frequently. Here is a typical test run using the minitest facility:

Standard Test Output

To my mind, this was not communicating the true progress of the tests. Consider: What version of minitest was in use? What files were being processed? How many tests were in each of those files? Most of all, why does this output look so boring?

Yes; boring! So being bored I thought: Surely I can do better! I came up with a gem I call minitest_visible to spruce things up. Here’s what its output looks like:

Enhanced Test Output

Much better! Clearer! Nice progress! And so entertaining? Well, at least I thought it was a big improvement. It would seem that not many people shared my view. I can see their points. Notice that the bottom line is the same in both cases:

21 runs, 92 assertions, 0 failures, 0 errors, 0 skips

This result line is what counts. These are tests, not video games. They should test the code without a lot of bells and whistles and let you get on with the real work.

I was also told that my little gem did not fit in with the minitest ecosystem. I asked what was meant by this but I never got a reply. I suppose somebody was channeling their inner Linus Torvalds of rudeness that day. I’ll never know. It also does not matter.

I have finally come to realize that the critics were right. The best answer is to keep things as simple and lean as possible. No being fancy; no showing off. To that end I have worked to remove the minitest_visible augmentation from almost all of my work.

Affected gems are: composite_rng, counted_cache, fibonacci_rng, flex_array, format_engine, full_clone, full_dup, fully_freeze, in_array, insouciant, lexical_analyzer, make_gem, mini_erb, mini_readline, mini_term, mysh, parse_queue, pause_output, safe_clone, and safe_dup. The exception is the fOOrth gem experiment that has so many test files that it would be hard to sort out what was happening without a little help. At least for now.

Now these gems can be used by others without the hassle of having to install a non-standard testing extension.

Is less more? I’m not sure. Is less better? Very often, yes it is.

Yours truly

Peter Camilleri (aka Squidly Jones)

Updating Dependencies

There are things you don’t want to see when you review your code. Here’s one:

Potential Security Vulnerability Warning Alert

What’s worse, this message was hard to track down. There was no way (that I found) to have GitHub tell me about all the code that was affected. I’d have to check each one manually until I had found and fixed them all.

I had several things going for me. The alerts were all caused by one thing; the discovery of a bug in the “rake” gem. Here’s one version of the bug report I found:

It was discovered that Rake incorrectly handled certain files. An attacker could use this issue to possibly execute arbitrary commands.

USN-4295-1: Rake vulnerability

Now, in theory, one could do an in-depth analysis of this bug and conclude that it has no impact on your code. I do not recommend this approach. I would not have confidence that I could see all the devious ways this fault could be exploited. Further, I know I cannot possibly predict every move by the criminals. That is why I will now focus on ensuring that my code specifies a version of rake that does not have this defect. That is, we need to specify rake versions greater than or equal to 12.3.3.

I write Ruby Gems and for those gems, dependencies are specified in the <gem_name>.gemspec file. Let’s see some rake dependency entries and see how they stack up to our security requirement:

spec.add_development_dependency "rake", "~> 10.0"

This was the dependency entry that generated my alert messages. To translate into plain English, it says that any version of rake is OK, so long as it was version 10.something. The last such release was 10.5.0 which clearly has the bug and thus generates a warning. This entry clearly must be replaced.

Now many of my Ruby Gems had the following entry that did not result in a warning, but still leaves something to be desired:

spec.add_development_dependency "rake", "~> 12.0"

This entry says to accept any version of rake so long as it is version 12.something. The last such release was 12.3.3, which is free of the bug! So, yes, this is much better, but it still permits earlier 12.x releases that predate the fix, and it limits rake to version 12. Version 13 is already out and forcing the use of an older version may not be a good idea. I chose to replace these entries as well.
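What the "~>" operator accepts can be checked with the Gem::Requirement and Gem::Version classes that ship with RubyGems. The following is a small sketch; the helper name accepts? and the version numbers tried are my own picks for illustration:

```ruby
require "rubygems"

# Does the given requirement string accept the given version?
# (accepts? is a hypothetical helper name.)
def accepts?(requirement, version)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

puts accepts?("~> 10.0", "10.5.0")  # any 10.x release is allowed
puts accepts?("~> 10.0", "12.3.3")  # 12.x is outside the range
puts accepts?("~> 12.0", "12.3.3")  # the fixed release is allowed
puts accepts?("~> 12.0", "12.0.0")  # so are earlier 12.x releases
puts accepts?("~> 12.0", "13.0.0")  # 13.x is shut out
```

In other words, "~> 12.0" behaves like ">= 12.0" combined with "< 13.0".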

So, we now examine the simplest and most permissive specification:

spec.add_development_dependency 'rake'

This translates to anything goes, but use the very latest if it’s not already installed. This is very simple, it generated no warnings, and it can get the latest. It will also settle for much older versions, including those with the bug. This too must be updated.

This is the entry recommended for use by the alert:

spec.add_development_dependency "rake", ">= 12.3.3"

This translates to “Anything from 12.3.3 or later”. This ensures that we will not use older versions of (buggy) code, and that we won’t exclude newer versions of code either.
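To see the recommended entry at work, here is a stripped-down sketch of a gemspec built in memory and then asked which rake versions it would accept. The gem name, author, and the rake_accepts? helper are all hypothetical; only the dependency line comes from the alert:

```ruby
require "rubygems"

# A minimal, hypothetical gemspec; only the rake line matters here.
SPEC = Gem::Specification.new do |s|
  s.name    = "example_gem"
  s.version = "0.1.0"
  s.summary = "Shows a rake development dependency with a safe floor."
  s.authors = ["Example Author"]
  s.files   = []

  s.add_development_dependency "rake", ">= 12.3.3"
end

# Would the gemspec above accept this version of rake?
def rake_accepts?(version)
  rake = SPEC.development_dependencies.find { |dep| dep.name == "rake" }
  rake.requirement.satisfied_by?(Gem::Version.new(version))
end

%w[10.5.0 12.3.2 12.3.3 13.0.0].each do |version|
  verdict = rake_accepts?(version) ? "accepted" : "rejected"
  puts "rake #{version}: #{verdict}"
end
```

The two releases older than 12.3.3 are rejected, while 12.3.3 and 13.0.0 are both accepted.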

Now there may be legitimate reasons to stick to older code. I know there are working web sites using ancient versions of rails running on equally old versions of ruby. Even so, there are (too many to list here) real risks to running obsolete code. If at all possible, keep up-to-date. It may require some effort, but it is worth it. Test your code with newer versions before pushing them out into the wild. If problems/issues arise, fix them and update your dependencies. This usually can be accomplished in a modest amount of time. If it can’t then you need to lock in the obsolete code dependencies and take a serious look at re-engineering your application.

In the end, everything needs maintenance. The reason we all avoid gems that have gone many years with no updates is that we know that nobody is tending house. I was caught by surprise and it is hard to anticipate what the next disruption will be, but by defensively coding, the risk can be reduced.

Finally, let me take a moment to list which of my gems needed to be updated due to this issue. If you have a current, safe version of rake installed, you need do nothing even if you are using these gems: composite_rng, connect_n_game, counted_cache, fibonacci_rng, format_engine, format_output, full_clone, full_dup, fully_freeze, in_array, insouciant, lexical_analyzer, make_gem, mini_erb, mini_readline, mini_term, mysh, parse_queue, pause_output, ruby_sscanf, safe_clone, safe_dup, test65, and vls. In addition, fOOrth, games_lessons, and rctp were updated but the changes are currently part of unreleased code.

Yours Truly

Peter Camilleri (aka Squidly Jones)

Survey Results: Introduction [Updated]

In April of this year, I completed gathering the raw data for my survey of gem downloads from my postings at the RubyGems web site. In the past, I have published progress reports at the 12 week and 24 week points. I am glad that the data collection is finally done. I admit that it was only about an hour a week, but after several months it began to take on a level of tedium and boredom.

To be clear, data was collected for all 52 weeks; however, the data for week 41 was lost in a system crash. The data for that week is interpolated from the data of weeks 40 and 42.
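As a sketch of what filling such a gap can look like, the snippet below takes the midpoint of the two neighbouring weeks. This assumes simple linear interpolation, and the download counts are made up for illustration; they are not the survey data:

```ruby
# Estimate a missing week as the midpoint of its two neighbours.
# (Assumes simple linear interpolation.)
def interpolate_missing(before, after)
  (before + after) / 2.0
end

# Hypothetical cumulative download counts for weeks 40 and 42.
week_40 = 1_200
week_42 = 1_260

puts interpolate_missing(week_40, week_42)  # estimate for week 41
```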

The $65,536 question is WHY? Data collections and studies are a lot of work and normally only undertaken with a goal in mind. My goal was simply to understand who was downloading the Ruby code I was writing and giving to the world. Unlike really popular authors, I did not have the luxury of looking to see thousands of downloads a day. My numbers were quite meager. My hypothesis was that by gathering detailed data, over a reasonably long span of time, some truths could be gleaned from that data that would reveal the nature of the user base. That is, assuming such a user base even existed.

You see, my fear is that I am just a crazy lunatic, working away in complete isolation, writing code that nobody will ever read. Just thinking about this scenario makes breathing difficult and causes my gut to twist and contort painfully.

The data itself reveals a brighter side. While subtle, there are some signs that there really are people out there at least looking at this code, and maybe incorporating it into their own projects.

There is a lot of data, too much for this blog or any single posting. In the coming days, I plan a number of articles that look at the data and draw some concrete conclusions about Ruby Gem downloads, how to gauge the success of a code release, and maybe some tips for success in the world of software components.

Best regards

Peter Camilleri

ps: My library of gems can be found at Ruby Gems and the source code is at GitHub.

Unconventional tools, #2

Hi All;

Now, we’ve all seen code that at least appears to work, but is very poorly written. You might even say the code “stinks”. Now, it can be really hard to detect code smells in your own code. A major air freshener company speaks of people going “noseblind” to smells. The old solution to this problem was to have code reviews. This never works. People gloss over problems, the reviews take forever, and feelings can be hurt or axes can be ground. What is needed is an objective, easy-to-perform, review of the code. What is needed is an automated tool!

For me, programming in free ruby, the tool is called… reek!

Now you may think: “Why do I need a tool to complain about my working code?” I must admit, I sometimes think that too. The answer to that question is that whenever I dig into the code and clean up the smells, the result is almost always much better code. Now, I know that’s a serious claim, so I am going to back it up with a real example from my own code. The following bit of code is used to process snippets of ruby code embedded in strings and surrounded by {{  }} structures. They generally go by the name “handlebars”. Here is our starting point:

#Process a string with code embedded in handlebars and 
#backslash quotes.
def eval_handlebars(in_str)
  out_str = ""

  loop do
    pre_match, match, in_str = in_str.partition(/{{.*?}}/m)
    out_str << pre_match
    return out_str.gsub(/\\\S/) {|found| found[1]} if match.empty?

    code = match[2...-2]
    silent = code.end_with?("#")
    result = instance_eval(code)
    out_str << result.to_s unless silent
  end
end

When I ran reek against this code, it made the following observations:

lib/mysh/user_input/handlebars.rb -- 2 warnings:
 [25, 27, 32]:FeatureEnvy: Object#eval_handlebars refers to
  out_str more than self (maybe move it to another class?)
 [19]:TooManyStatements: Object#eval_handlebars has approx 10 
  statements
2 total warnings

Well, the code is pretty ugly, so perhaps this is not surprising. So; how to proceed? The first clue comes from the comment line.

#Process a string with code embedded in handlebars and 
#backslash quotes.

This method is doing two distinct things! Maybe it should be two distinct methods? So here’s the first fixup:

 
#Process a string with code embedded in handlebars and 
#backslash quotes.
def eval_handlebars(str)
  do_process_handlebars(str).gsub(/\\\S/) {|found| found[1]}
end

private

#Process a string with code embedded in handlebars.
def do_process_handlebars(in_str)
  out_str = ""

  loop do
    pre_match, match, in_str = in_str.partition(/{{.*?}}/m)
    out_str << pre_match
    return out_str if match.empty?

    code = match[2...-2]
    silent = code.end_with?("#")
    result = instance_eval(code)
    out_str << result.to_s unless silent
  end
end

The new method do_process_handlebars just does handlebars while the original method takes that result and processes backslash quotes. So what does reek think of the code now?

lib/mysh/user_input/handlebars.rb -- 1 warning:
 [26]:TooManyStatements: Object#do_process_handlebars has approx 9 
  statements
1 total warning

This is a lot better, but now my new method is too long! What to do? I could ignore the problem; I could tell reek to just ignore the problem; or I could mashup the code and use sneaky tricks to make it use fewer lines. None of these are good choices here. Instead, let’s really look at what the code is doing. We see a loop, processing a string by repeatedly looking for a regular expression and then performing a substitution of the found text with new text. Wow! Did I really write that code? What we have here is a kludgy reinvention of the gsub method! The very same method already used to perform the backslash quoting. Zounds!
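To make the resemblance concrete, here is a generic sketch (not taken from mysh) of the same loop-and-partition pattern next to gsub with a block. Both hypothetical methods wrap every run of digits in angle brackets:

```ruby
# Hand-rolled replacement: peel off one match at a time with partition.
def wrap_numbers_loop(in_str)
  out_str = +""   # unary plus gives a mutable string

  loop do
    pre_match, match, in_str = in_str.partition(/\d+/)
    out_str << pre_match
    return out_str if match.empty?

    out_str << "<#{match}>"
  end
end

# The same job handed to gsub with a block.
def wrap_numbers_gsub(str)
  str.gsub(/\d+/) { |match| "<#{match}>" }
end

puts wrap_numbers_loop("a1b22c")
puts wrap_numbers_gsub("a1b22c")
```

Both calls print the same text; the loop is simply doing by hand what gsub already does.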

OK; so let’s see what happens when we replace the kludge of gsub with the actual gsub method:

#Process a string with code embedded in handlebars.
def do_process_handlebars(str)
  str.gsub(/{{.*?}}/m) do |match|
    code = match[2...-2]
    silent = code.end_with?("#")
    result = instance_eval(code)

    (result unless silent).to_s
  end
end

Wow! That code is MUCH better looking! What does reek think about it?

0 total warnings

No more code smells found, well at least by the reek tool. I did make one further change. I corrected the ambiguous top comment line:

#Process a string with backslash quotes and code embedded 
#in handlebars.

It would seem that even automated code scanning tools do not check for poorly written comments.
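For anyone who wants to try the finished methods, here is a small standalone sketch. It defines them at the top level rather than inside mysh, and it escapes the braces in the regular expression, which is my own cautious variation; the sample strings are invented:

```ruby
# Process a string with backslash quotes and code embedded
# in handlebars.
def eval_handlebars(str)
  do_process_handlebars(str).gsub(/\\\S/) { |found| found[1] }
end

# Process a string with code embedded in handlebars.
def do_process_handlebars(str)
  str.gsub(/\{\{.*?\}\}/m) do |match|
    code = match[2...-2]              # strip the surrounding {{ and }}
    silent = code.end_with?("#")      # a trailing # suppresses the output
    result = instance_eval(code)

    (result unless silent).to_s
  end
end

puts eval_handlebars("1 + 2 = {{ 1 + 2 }}")
puts eval_handlebars("{{ @count = 5 #}}count is {{ @count * 2 }}")
puts eval_handlebars('2 \* 3 = {{ 2 * 3 }}')
```

The first line shows plain evaluation, the second shows a silent handlebar that sets a value for later use, and the third shows a backslash quote being removed after the handlebars are processed.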

So, I think it is pretty clear that the new code is much better than the original. I can tell you that it also runs faster and uses less memory. Can we draw a conclusion from all of this? How about:

When things smell bad, put away the scented air spray and just clean up the mess!

Best regards;

Peter Camilleri (aka Squidly Jones)

Notes:

  1. In the quest to write better code, I am inspired by Sandi Metz. An awesome video by her on the matter of smelly code is Get a Whiff of This by Sandi Metz.
  2. Some code was slightly reformatted to fit into the blog post.


Gem Download Study: 24 Weeks

Well, 12 more weeks have passed and it is time for the next installment of the Ruby Gems Download study. The goal of this study was to see if it was possible to observe patterns in the download rates that might lead to useful conclusions about the mix of entities doing the downloading. The gems repository makes this download data readily available. For my gems, you can see that data here.

For the 24 week report there are two main data findings: a graph of individual, cumulative downloads for each gem for 24 weeks and a graph of the weekly downloads for all gems for the same period.

gem_downloads_24

I have collected 24 weeks of data, so here is the graph showing the rate of downloads.

If you look, you will see that the slopes of the lines vary. Some lines are very flat, while others are sloped upward at a much sharper angle. This means that the rate of downloads is also different.
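The link between slope and download rate can be shown with a few lines of Ruby. This sketch turns cumulative totals into weekly gains; the numbers are invented for illustration and are not the study’s data:

```ruby
# Turn cumulative download totals into per-week download counts.
def weekly_rates(cumulative)
  cumulative.each_cons(2).map { |earlier, later| later - earlier }
end

# Hypothetical totals for two gems over five weeks.
flat_gem  = [100, 102, 103, 105, 106]
steep_gem = [100, 130, 155, 190, 220]

p weekly_rates(flat_gem)   # small weekly gains: a nearly flat line
p weekly_rates(steep_gem)  # large weekly gains: a steep line
```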

weekly_downloads_24

The weekly, combined results show that the rate of downloads is anything but constant. The valleys seem to correspond to periods this summer when large numbers of people would be on vacation. (Not me of course; I’m never/always on vacation.)

The spiky peaks do tend to correspond to times when large numbers of gems were mass updated. This is especially the case for the last and fourth from last points where a code of conduct and changes to the mini_readline gem were propagated to most of the gems. This is clearly indicative of automated downloading.

This study shall continue. I estimate that the next posting on this matter will be one to study an entire year’s worth of data.

Until then, Many Thanks and Best Regards;

Peter Camilleri (aka Squidly Jones)

Unconventional tools, #1

Hello all;

In this series of posts, we shall take a look at some unconventional tools that help solve various programming (and other) problems that often come up. For this first article, I want to tackle a problem that actually made the list of the two hardest problems in computer science. Here’s that list:

  1. Selecting meaningful names.
  2. Cache coherency and invalidation.
  3. Off-by-one errors.

Ignoring the off-by-one error in the list, the first issue is selecting meaningful names. Way back in the bad old days, it was considered acceptable to have a program full of entities like i, j, k, and pmz21: variable names that were meaningless, supported by a mountain of comments that were often completely out of date and/or misleading.

Slowly, computers improved and space was no longer so violently cramping the writer’s style. This, however, led to the opposite problem of names like:

if (index_selection > the_number_of_selections_available_in_the_menu)
  display_a_pop_up_error_message("Oops");

Names so verbose that they obscure the intent of the code.
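For contrast, here is one way the same check might read in Ruby with shorter names that still carry their meaning. Every name in this sketch (check_choice, show_error, menu) is hypothetical and chosen only for illustration:

```ruby
# Print an error to standard error and hand the text back.
def show_error(text)
  warn text
  text
end

# Complain when the choice is beyond the end of the menu.
def check_choice(choice, menu)
  show_error("Oops") if choice > menu.size
end

menu = %w[open save quit]
check_choice(5, menu)   # out of range, so the error is shown
check_choice(2, menu)   # in range, so nothing happens
```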

What we really need is names that are meaningful and not verbose. It boils down to a matter of selecting the right words. Regardless of what programming language I use, I program in the English language. There are over a million words defined for that language and I can assure you that I do NOT know them all. For those for whom English is not their native tongue, the problem is even worse.

What to do? Use a tool to help navigate the possible words for the job: Thesaurus.com!

Let’s try a real example I encountered just this morning. I was looking to update the description of my mini_readline ruby gem. I wanted to describe the fact that it came with four sample auto-completing thing-a-ma-bobs. The only word that came to mind was “engine”, which sort of worked but sounded too mechanical. I wanted a word that would express the idea of work done on the user’s behalf. So off to the thesaurus web site I went and punched in engine… A large page of results came back, including the word I sought: agent!

So the next time you struggle with one of the most difficult tasks in programming (and writing too) consider using a free tool to ease the burden of picking that one in a million perfect word for the job!

Best regards;

Peter Camilleri (aka Squidly Jones)

Announcing the vls utility.

One of the joys of modern programming is the ability to utilize external code libraries to speed up development, saving time, and reducing wasted effort. The Ruby language is especially blessed in this regard with its system of code gems. Instead of the narrow minded NIH (not invented here) mindset, the world has become our tool chest.

There is, however, a downside to this modular nirvana. Versioning! To be precise: Am I using the correct version of each (and every) little code gem? This problem goes back all the way to the old Visual Basic days. Back then it was called DLL Hell, as developers struggled to maintain a myriad of cryptic, often poorly documented binary files.

Now Ruby does have the bundler utility that allows gem versions to be specified, but what if you simply want to know: When I use this application, what modules/classes are being used?

The vls utility answers that question. To use it, simply enter:

$ vls <names>

where names are a list of gems/files to be required before the modules are listed. Here, see an example of this in action:

$ vls fOOrth
vls (VersionLS): 0.1.0

Bignum, 0.0.5
Complex, 0.0.5
Date::Infinity, 0.0.5
FalseClass, 0.0.5
Fixnum, 0.0.5
Float, 0.0.5
FormatEngine, 0.7.2
FullClone, 0.0.5
Gem, 2.2.2
InArray, 0.1.5
Integer, 0.0.5
MiniReadline, 0.4.8
NilClass, 0.0.5
Numeric, 0.0.5
Rational, 0.0.5
Regexp, 0.0.5
RubySscanf, 0.2.1
SafeClone, 0.0.3
Symbol, 0.0.5
TrueClass, 0.0.5
XfOOrth, 0.6.1
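For the curious, the general idea of such a listing can be sketched in a few lines of Ruby. This is only my guess at the approach, not the vls source code: it walks every loaded module or class and reports those that define their own VERSION constant. The names versioned_modules and DemoLibrary are hypothetical:

```ruby
# List every loaded module or class that defines its own VERSION constant.
def versioned_modules
  found = []

  ObjectSpace.each_object(Module) do |mod|
    begin
      name = mod.name
      next unless name && mod.const_defined?(:VERSION, false)

      found << [name.to_s, mod.const_get(:VERSION, false).to_s]
    rescue StandardError, ScriptError
      next  # skip anything that objects to being inspected
    end
  end

  found.sort
end

module DemoLibrary        # stand-in for a required gem
  VERSION = "1.2.3"
end

versioned_modules.each { |name, version| puts "#{name}, #{version}" }
```

Unlike the output shown above, this sketch only reports constants defined directly on each module, so the exact list will differ.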

The vls gem may be found at: https://rubygems.org/gems/vls and the source code lives at: https://github.com/PeterCamilleri/vls.

Yours Truly

Peter Camilleri (aka Squidly Jones)

The Clone’s Family Tree [Updated]

In Ruby, most data is accessed by reference. That is, variables contain a sort of hidden pointer to the data, not the data itself. This is very efficient but has a hidden trap. Assignment does not copy the data. It only copies the reference! Consider the following snippet of code:

a = 'foo'
b = a
a << 'bar'
puts b

What do you think the “puts b” statement will print? Turns out it’s “foobar”. Since only references were copied, when the original variable (a) was modified in the third line, both variables (a and b) were “mutated”.

So how does one copy values and not just references? The Ruby programming language has two methods for duplicating data. These are “dup” and “clone”. While these methods are quite useful, they both suffer from two shortcomings:

  1. In Ruby, if an attempt is made to clone (or dup) an immutable data item like a number, an error occurs. The justification for this uncharacteristic strictness is not at all clear, but it does mean that the clone (or dup) operation must be applied with great care.
  2. The copying process used by both clone and dup is said to be a shallow (or incomplete) copy. While the target data structure is copied, any internal data structures are not. References to those data remain aliased in the copy.
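The second shortcoming is easy to demonstrate. In this sketch, a hash holding an array is copied with dup and the inner array turns out to be shared. For comparison, a Marshal round trip is used as a stand-in for a full, recursive copy; that is a common Ruby idiom and not necessarily how the gems described here work. The deep_copy helper is a hypothetical name:

```ruby
# A stand-in for a full, recursive copy using a Marshal round trip.
def deep_copy(obj)
  Marshal.load(Marshal.dump(obj))
end

original = { name: "foo", tags: ["a", "b"] }

shallow = original.dup
shallow[:tags] << "c"     # reaches through to the shared inner array
p original[:tags]         # the original sees the change too

deep = deep_copy(original)
deep[:tags] << "d"        # only the independent copy is changed
p original[:tags]         # the original is untouched this time
```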

I started off to create a gem to resolve these issues. I ended up creating a family of four gems that are tailored to the exacting data copying requirements of the application. Here is a summary of those gems:

Family Chart
Depth / Action               | Need to copy all data and metadata attributes? | Need to copy data only?
Only need a shallow copy?    | Use the safe_clone gem. <Source>               | Use the safe_dup gem. <Source>
Need a full, recursive copy? | Use the full_clone gem. <Source>               | Use the full_dup gem. <Source>

Notes:

  • Since none of these gems override the default clone and dup methods, the default behaviors remain available. Further, if multiple, differing requirements exists, more than one family member gem may be employed in the same project without fear of conflict.
  • If multiple family gems are employed, they will each need to be installed and required into the application. See each gem’s github source for details.
  • Metadata attributes include the frozen status and singleton methods. However, the tainted status is always copied.
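The difference between copying metadata and copying data only can be seen with Ruby’s built-in clone and dup. This sketch uses a frozen array with a singleton method; the build_sample helper and the describe method are hypothetical names:

```ruby
# Build a frozen array that carries a singleton method.
def build_sample
  list = ["a", "b"]

  def list.describe
    "I have #{size} items"
  end

  list.freeze
end

original = build_sample
cloned   = original.clone
duped    = original.dup

p cloned.frozen?                 # clone keeps the frozen status
p cloned.respond_to?(:describe)  # clone keeps singleton methods
p duped.frozen?                  # dup drops the frozen status
p duped.respond_to?(:describe)   # dup drops singleton methods
```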

I hope you find these little gems as useful as I have found them to be. If you like them, give the code repository a star as a sign of approval!

Yours Truly

Peter Camilleri (aka Squidly Jones)

fOOrth

It is with great pleasure that I am finally able to announce the release of the initial beta of the fOOrth programming language system, version 0.5.0. This language, written entirely in pure Ruby, is released both as a Ruby gem and in source code on GitHub.

So, what exactly is the fOOrth programming language system anyway? The source code repository has this to say about the project:

The fOOrth language is an experimental variant of FORTH that attempts to incorporate object oriented and functional programming concepts. It also tries to extrapolate an alternate reality where FORTH was not frozen in the past, but continued to grow and develop with the times. Above all this project is the result of nearly 30 years of thought on the design of threaded compilers and languages with simplified grammars and syntax.

Also included in the docs folder are a User’s Guide and Reference in both open office and PDF formats.

A lot of work and effort has gone into this personal labour of love. For example, I have discovered that writing good documentation is a lot harder than it looks. It is certainly harder than just writing code. I have also found however that having to explain what is going on in plain, simple, and easy-to-understand terms has resulted in many improvements to the code that would not have happened otherwise. In effect, it served as a sober, detailed review of the code being generated.

This is still only a beta, and there is still a great deal more work to do. I hope you find it useful or at least interesting. If you like what you see, give the code a star to show your support. Suggestions, comments, complaints and ideas are always most welcomed.

Yours Truly

Peter Camilleri (aka Squidly Jones)

The redesigned minitest_visible gem (v0.1.0)

Just because code works, doesn’t mean it’s the correct code. A case in point is my own minitest_visible gem. This little bit of Ruby code adds simple progress tracking to the testing process. In any programming, and especially in a dynamic language like Ruby, testing is vital. So is having faith in those tests.

Now the standard minitest testing system is awesome! It is full of helpful methods for confirming the correctness of code. It is however a little terse regarding progress. That is where minitest_visible comes in. When used, it lets you know the version of minitest being used and prints out the name of each test file as it is processed.
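As a rough illustration of that kind of progress tracking, here is a tiny sketch. It is not the gem’s actual interface; the ProgressReporter class, its methods, and the version string are all hypothetical:

```ruby
# Report the test library version once, then name each test file
# as it is processed.
class ProgressReporter
  def initialize(version, out = $stdout)
    @out = out
    @out.puts "Running tests with minitest version #{version}"
  end

  # Announce a test file by its base name.
  def track(file)
    @out.puts "Processing #{File.basename(file)}"
  end
end

reporter = ProgressReporter.new("5.0.0")   # placeholder version string
reporter.track("test/example_tests.rb")    # hypothetical file name
```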

A fairly simple task. Yet, when I came up with the first version of this code, it was clunky and ugly, but it worked. I suppose I had other matters pressing at the time, but I let things stand at “good enough”.

Recently, I began to think that there must be a way to make the code better, cleaner, and easier to use. It seems that I have indeed learned some stuff in the last year because I am now writing about a new version that is far simpler and more streamlined.

OK, let’s be clear: The old way of doing things is still supported, but prints out a message about the needed changes. Tests still pass though, so disruption and panic should be minimal.

The new minitest_visible can be found at https://rubygems.org/gems/minitest_visible and the source code is at https://github.com/PeterCamilleri/minitest_visible.

Yours Truly

Peter Camilleri (aka Squidly Jones)