This week’s episode of Fractals in Elixir has little to do with fractals and more to do with plain old Elixir. I cleaned up the way I’m processing the parameters for a fractal.
My code this week is in my mandelbrot repo on GitHub tagged blog_2016_07_17. Links to the previous articles are available in the README.
So I’ve been fighting a few problems pretty much from the beginning:
So, I admit it: I like YAML. I parsed YAML inputs for my Haskell version of the fractals program. I tried parsing YAML when I wrote my first Elixir version, but it didn’t go well. Then I discovered that poison parsed JSON just fine.
I didn’t want to maintain both YAML and JSON versions, so I found some code that converts YAML to JSON, and I created a rake task around that. But I had to remember to run that rake task every time I changed a YAML file, and it was annoying trying to read the JSON files because they were all flat and very explicitly stringified.
More recently I discovered yaml_elixir. The switch from poison to yaml_elixir was very simple. Instead of this:
json = filename |> File.read! |> Poison.Parser.parse!
I now parse YAML with this:
yaml = YamlElixir.read_from_file(filename)
So back in my Haskell program, I specified options for a fractal. The word/metaphor “options” may or may not have been a good decision for Haskell, but it was a bad idea for Elixir.
In Elixir and OTP, many functions take an options keyword list (e.g., GenServer.start_link/3). I found that while I was writing my supervisors and servers, I was having trouble keeping straight what options referred to: my options for a fractal or options for the OTP server.
I thought about using the term “config”, but Elixir programs already have a “config” which works more at a system level.
I thought about using the term “specification”, but it’s kind of long, and it would be too easily confused with testing “specs”, especially since I’m using espec for testing.
I thought about using the term “params”. I’m not sure I like the way “parameters” is often abbreviated as “params”,[1] and it’s very difficult to make plural.[2] However, “params” is common enough that I figured most developers would understand what it means: “params” are values that drive the program’s computation.
I went through the whole program and changed Options to Params and options to params. This turned out to be pretty easy and straightforward.
Params come from one Enum
In my original Elixir program, I intended for flags specified on the command line to override params[3] set in an input file. Instead, flags were only used to set a chunk size and change concurrency options, and they could not be specified in an input file.
Part of the problem was that the params needed to be built from two different sources: flags and an input file. I sort of had this code:[4]
def main(args) do
  case OptionParser.parse(args) do
    {flags, [params_filename, output_filename], _} ->
      Params.parse(flags, params_filename, output_filename)
      |> Params.open_output_file()
      |> Params.set_next_pid(self())
      |> main_helper()
    _ ->
      usage()
  end
end
args are the command-line arguments.
OptionParser is a standard Elixir library to parse command-line arguments.
flags are the flags specified on the command line.
params_filename and output_filename are positional arguments from the command line.
Params.open_output_file and Params.set_next_pid inject more params.
main_helper does the rest of the real work.
Everything here just seems like special handling to me. I feed parsed raw params (flags) and unparsed raw params (params_filename) and just a filename (output_filename) to Params.parse. But that one call into the Params module isn’t enough; I have to do some ad-hoc additions to open an output file and set the next pid.
I tried a variety of things to clean this up. The breakthrough came when I realized that params_filename and output_filename and next_pid were not separate entities to pass to Params.parse. They were just raw params that weren’t specified as flags on the command line. Positional arguments are awfully convenient on the command line; some values are internal; wherever they come from, everything is a raw param.
def main(args) do
  case OptionParser.parse(args) do
    {flags, [params_filename, output_filename], _} ->
      flags
      |> Keyword.put(:params_filename, params_filename)
      |> Keyword.put(:output_filename, output_filename)
      |> Keyword.put(:next_pid, self())
      |> Params.parse()
      |> main_helper()
    _ ->
      usage()
  end
end
Opening the file was really related to the filename itself and should be part of parsing that raw param.
Keyword.put/3 puts the key-value pair at the beginning of the list and overrides any setting the key might have in the flags.
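To make that Keyword.put/3 behavior concrete, here is a minimal sketch. The flag names and filenames are made up for illustration; they are not taken from the repo.

```elixir
# Hypothetical flags, roughly what OptionParser might hand back.
flags = [chunk_size: 1000, output_filename: "from_flag.ppm"]

updated =
  flags
  |> Keyword.put(:params_filename, "burning_ship.yml")
  |> Keyword.put(:output_filename, "positional.ppm")

# Each put removes any existing entry for the key and prepends the new pair,
# so the positional output filename replaces the one given as a flag.
IO.inspect(updated)
```

Because the positional values end up at the front of the list, they are also the first entries seen by anything that walks the list in order.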
Params struct
Params.parse used to parse the input file and the flags separately and with two different functions. It was a lot of special handling, as if they were two very different things.
Parsing the input file centered around a Params struct:
def parse_input_file(params, json) do
  %{params |
    fractal:     parse_fractal(json["fractal"]),
    size:        parse_size(json["size"]),
    color:       parse_color(json["color"]),
    seed:        parse_seed(json["seed"]),
    upper_left:  parse_complex(json["upperLeft"]),
    lower_right: parse_complex(json["lowerRight"]),
    c:           parse_complex(json["c"], %Complex{real: 1.0, imag: 0.0}),
    z:           parse_complex(json["z"], %Complex{real: 0.0, imag: 0.0}),
    r:           parse_complex(json["r"], %Complex{real: 0.0, imag: 0.0}),
    p:           parse_complex(json["p"], %Complex{real: 0.0, imag: 0.0})
  }
end
I had a different function for dealing with the flags:
def parse_flags(params, user_flags) do
  flags = Keyword.merge(@default_flags, user_flags)
  %{params |
    chunk_size: Keyword.fetch!(flags, :chunk_size)
  }
end
parse/3 brought these together:
def parse(flags, params_filename, image_filename) do
  %Params{image_filename: image_filename}
  |> parse_file(params_filename)
  |> parse_flags(flags)
end
Default values are a mess. Some values like fractal and color have no default value (although they could and probably should). c, z, r, and p have default values because parse_complex/2 has an optional second parameter for a default value. chunk_size has a default value because it’s a flag, and I have default values for flags. Three different ways to handle default values.
Ultimately, the problem was that I let the Params struct drive the code. But that’s the thing I should be transforming and accumulating. The incoming raw params should be driving this code.
Iterating over a Keyword list or Map is the same as far as Enum is concerned; they both implement the Enumerable protocol. Flags come in a Keyword list while a parsed YAML file results in a Map. So I just need a Params.parse that handles an Enumerable:
def parse(raw_params, params \\ default()) do
  raw_params
  |> Enum.reduce(params, &parse_attribute/2)
  |> precompute()
end
raw_params is the Enumerable.
params is a Params struct, an accumulator.
default returns a Params struct with default values in it.
Enum.reduce passes each key-value tuple to parse_attribute, which has three clauses.
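The following is a stripped-down sketch of that idea, not the code from the repo. ParamsSketch and its three fields are hypothetical stand-ins, and the per-attribute parsing and precompute step are left out. It only shows that the same reduce handles a Keyword list and a Map, with defaults living in one place: the struct definition.

```elixir
defmodule ParamsSketch do
  # Hypothetical, trimmed-down stand-in for the real Params struct.
  # The defaults live here and nowhere else.
  defstruct fractal: :mandelbrot, chunk_size: 1000, seed: 666

  def default, do: %ParamsSketch{}

  # Works for any Enumerable that yields {key, value} tuples.
  def parse(raw_params, params \\ default()) do
    Enum.reduce(raw_params, params, &parse_attribute/2)
  end

  # Raises KeyError for a key the struct does not define.
  defp parse_attribute({attribute, value}, params) do
    %{params | attribute => value}
  end
end

# A Keyword list, as flags would arrive.
from_flags = ParamsSketch.parse(chunk_size: 500)

# A Map, as a parsed and symbolized input file would arrive.
from_file = ParamsSketch.parse(%{fractal: :julia, seed: 42})

IO.inspect(from_flags)
IO.inspect(from_file)
```

Anything not mentioned in the raw params keeps the default from the struct, which is why no parsing function needs to know about defaults.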
The last and most general clause of parse_attribute handles most attributes:
defp parse_attribute({attribute, value}, params) do
  %{params | attribute => parse_value(attribute, value)}
end
parse_value has five clauses. Here’s one of them:
defp parse_value(:fractal, value) do
  String.to_atom(String.downcase(value))
end
So parse_attribute({:fractal, "Mandelbrot"}, params) ends up adding :fractal mapped to :mandelbrot in a new Params built off of params.
Parsing the size is a bit interesting since it has to go from "1024x768" to %Size{width: 1024, height: 768}:
defp parse_value(:size, value) do
  [_, width, height] = Regex.run(~r/(\d+)x(\d+)/, value)
  %Size{
    width:  String.to_integer(width),
    height: String.to_integer(height)
  }
end
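One thing to be aware of: Regex.run/2 returns nil when nothing matches, so the version above fails with a MatchError on a malformed size. If a friendlier failure is wanted, a variant along these lines might do. SizeSketch and the :invalid_size error tuple are hypothetical names, not something from the repo.

```elixir
defmodule SizeSketch do
  # Hypothetical stand-in for the Size struct used in the article.
  defstruct [:width, :height]

  def parse(value) when is_binary(value) do
    # Anchored with \A and \z so that input such as "1024x768px" is
    # rejected instead of being partially matched.
    case Regex.run(~r/\A(\d+)x(\d+)\z/, value) do
      [_, width, height] ->
        size = %SizeSketch{
          width: String.to_integer(width),
          height: String.to_integer(height)
        }

        {:ok, size}

      nil ->
        {:error, {:invalid_size, value}}
    end
  end
end

IO.inspect(SizeSketch.parse("1024x768"))
IO.inspect(SizeSketch.parse("1024 by 768"))
```

The trade-off is that callers now have to handle a tagged tuple, so this only pays off if bad input should produce a usage message rather than a crash.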
Several attributes are complex numbers:
@complex_attributes [:upper_left, :lower_right, :c, :p, :r, :z]
defp parse_value(attribute, value)
    when attribute in @complex_attributes do
  Complex.parse(value)
end
:seed and :chunk_size don’t need any special parsing, and there’s a catch-all case for them:
defp parse_value(_attribute, value), do: value
I love how simple and short these functions are, worried only about parsing one kind of data—and no default values!
For the output filename, I need to open an output stream:
defp parse_attribute({:output_filename, filename}, params) do
  %{params | output_pid: File.open!(filename, [:write])}
end
I should probably also record the output filename in the official params, but the stream is absolutely necessary.[5]
Magic!
defp parse_attribute({:params_filename, filename}, params) do
  yaml = filename |> YamlElixir.read_from_file() |> symbolize()
  parse(yaml, params)
end
Reading and parsing the YAML file is straightforward, simple, and oh-so-powerful: call Params.parse recursively. I pass in the params that have accumulated so far to the recursive call so that more params can be added.
I’m often struck by how powerful and simple recursion can be. Suddenly any raw attribute can be specified as a flag or in an input file. And since the solution is recursion, the input can be recursive. Yes, I can specify a params_filename in my input files, and the settings accumulate across all flags and input files.
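Here is a small, self-contained sketch of that accumulation. To keep it runnable without yaml_elixir, the "files" are an in-memory map from filename to raw params; the module name, filenames, and fields are all invented for the example. The file contents are Keyword lists so that the order of attributes is fixed.

```elixir
defmodule RecursiveParamsSketch do
  # Hypothetical, trimmed-down params struct.
  defstruct fractal: nil, seed: nil, chunk_size: 1000

  # Stand-in for reading and parsing YAML files: filename => raw params.
  @files %{
    "base.yml" => [fractal: "Mandelbrot", seed: 1],
    "julia.yml" => [params_filename: "base.yml", fractal: "Julia"]
  }

  def parse(raw_params, params \\ %__MODULE__{}) do
    Enum.reduce(raw_params, params, &parse_attribute/2)
  end

  # A filename is just another raw param: look up its contents and keep
  # accumulating into the params gathered so far.
  defp parse_attribute({:params_filename, filename}, params) do
    @files |> Map.fetch!(filename) |> parse(params)
  end

  defp parse_attribute({attribute, value}, params) do
    %{params | attribute => value}
  end
end

params =
  RecursiveParamsSketch.parse(params_filename: "julia.yml", chunk_size: 250)

IO.inspect(params)
```

In this run, julia.yml pulls in base.yml first, then overrides its fractal, and the chunk_size given alongside the filename is applied last.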
Overriding params is somewhat unpredictable. It mostly works the way I want (flags override anything in the files). But due to the way that Maps sort their keys, you may get strange overriding if you parse one file from another file.
Only the last params_filename in a file will be honored. It might be nice to specify more than one.
output_filename can be specified more than once in input files, but only the last one encountered will get any data. All the others will have a file handle opened for them that won’t be explicitly closed.
I have no protection from an infinite recursion reading input files.
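If that ever becomes a problem, one possible guard is to carry the set of files currently being read alongside the params and refuse to re-enter one of them. This is only a sketch of that idea, not something in the repo; the module, the in-memory @files map, and the error message are all hypothetical.

```elixir
defmodule GuardedParamsSketch do
  # Hypothetical, trimmed-down params struct.
  defstruct fractal: nil, seed: nil

  # Stand-in for YAML files on disk; a.yml and b.yml include each other.
  @files %{
    "a.yml" => [params_filename: "b.yml"],
    "b.yml" => [params_filename: "a.yml"],
    "c.yml" => [fractal: "Julia", seed: 7]
  }

  def parse(raw_params) do
    {params, _open} = parse(raw_params, {%__MODULE__{}, MapSet.new()})
    params
  end

  # The accumulator is {params, open}, where open holds the files that are
  # currently being read further up the call chain.
  defp parse(raw_params, acc) do
    Enum.reduce(raw_params, acc, &parse_attribute/2)
  end

  defp parse_attribute({:params_filename, filename}, {params, open}) do
    if MapSet.member?(open, filename) do
      raise ArgumentError, "cyclic params_filename: #{filename}"
    end

    contents = Map.fetch!(@files, filename)
    {params, _} = parse(contents, {params, MapSet.put(open, filename)})

    # Restore the previous set so that reading the same file twice from
    # two unrelated places is still allowed.
    {params, open}
  end

  defp parse_attribute({attribute, value}, {params, open}) do
    {%{params | attribute => value}, open}
  end
end

ok = GuardedParamsSketch.parse(params_filename: "c.yml")

cycle =
  try do
    GuardedParamsSketch.parse(params_filename: "a.yml")
  rescue
    error in ArgumentError -> error.message
  end

IO.inspect(ok)
IO.inspect(cycle)
```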
Ultimately, all of these issues are not that important. I have much cleaner code, and it’s more powerful. So I’m going to declare params handling done for now.
I started to implement a few more fractals, but I discovered that I need to tweak the cutoff magnitude and the number of iterations in order for some of the fractals to look good. I’m handling these numbers inconsistently across several modules, and I’d like to be able to tweak them for each run, so I’m moving them into the Params.
Also, I’m looking to solve some of my code duplication issues, especially in the escape-time modules.
[1] I also don’t like “config” as an abbreviation for “configuration”.
[2] At one point, I was going to have a list of params and had trouble naming it: list_of_params? Inconsistent, too long, and includes a datatype in the name. params_collection? Also inconsistent, too long, and includes a vague datatype in the name. paramses? Too Gollum.
[3] When I made the “one Enum” changes described in this section, the params were still called “options”. So you’ll see “options” used in the commit where the change was made, but I don’t think that transformation is that interesting. Look at the tag for this week to see the finished product.
[4] I really worked this code over so that it sucks in the right way for my story here, but if you go looking for it in my repo, it’s even worse.
[5] Params has a close function for closing any resources opened by the Params. For now, that’s just the one output file. If it’s not closed, the output file might not get properly flushed.