Jun 8, 2020
It is probably not news to anyone that there are two hard things in Computer Science, and naming things is one of them. In my last post, I explained why I felt programming is not boring because there is some art in doing it[1]. I will not spend any time here arguing whether naming things is an art or not. Instead, I will go straight into how to name things properly. All the examples below use the Julia language.
To me, it is extremely important to write code that is readable and easy to understand. If I write it properly, my colleagues will be able to read it, support it, and enhance it going forward. If I write it badly, I may even curse at myself in the future.
I will not cover naming conventions such as `CamelCase` or `snake_case`, as they are fairly well documented and understood. Rather, I will focus on semantics.
What are the things that we need to give names to? In Julia, the main things are modules, data types, functions, constants and variables. The very first decision is to choose between using a verb, a noun, or an adjective.
This is actually not too difficult. A reasonable approach is to adopt the following convention:
Thing | Choice of Word |
---|---|
Modules | Noun |
Data types | Noun or Adjective |
Functions | Noun or Verb |
Constants/Variables | Noun |
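
As a rough illustration of the table, here is a minimal sketch in which every name (`Inventory`, `Sellable`, `StockItem`, `MAX_STOCK`, `stock_level`, `restock!`) is hypothetical and chosen only to show one word choice per row. It is not taken from any real package.

```julia
# All names below are hypothetical; they only illustrate the table above.
module Inventory                          # module: noun

export Sellable, StockItem, MAX_STOCK, stock_level, restock!

const MAX_STOCK = 1_000                   # constant: noun

abstract type Sellable end                # data type: adjective

mutable struct StockItem <: Sellable      # data type: noun
    name::String
    quantity::Int
end

# Function named with a noun: emphasizes the value that comes back.
stock_level(item::StockItem) = item.quantity

# Function named with a verb: emphasizes the action (the `!` marks mutation).
function restock!(item::StockItem, amount::Int)
    item.quantity = min(item.quantity + amount, MAX_STOCK)
    return item
end

end # module

using .Inventory

widget = StockItem("widget", 990)         # variable: noun
restock!(widget, 50)                      # capped at MAX_STOCK
```

Reading the last two lines aloud ("restock the widget", "stock level of the widget") is a quick way to check whether the word choice feels natural.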
A good module name typically encompasses a group of concepts and functionalities provided by the module. It needs to be broad enough to cover all functionalities but specific enough to distinguish itself from other modules.
For example, I have a `CircularList` module that provides an implementation of a circular linked list. I do not name it `LinkedList`, nor would I name it `List`, for those are too broad a concept.
Sometimes you want to think bigger and name the module a little broader than what is currently implemented. I think it is appropriate as long as there is a plan to develop the package further, and it does not cause any confusion for any user of the package. But, it should not miss the mark.
Additional guidance can be found at the package naming guidelines documentation.
Data types are used to represent either abstract concepts (abstract types) or concrete data structures (concrete types). That's why nouns are appropriate. When a data type represents a behavior or trait, it is appropriate to use an adjective such as `Iterable` or `Pretty`.
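
To make the noun/adjective split concrete, here is a small sketch using the common trait-dispatch pattern. The names `Shape`, `Circle`, `Label`, `Plain`, `style`, `caption`, and `render` are assumptions made up for this example; only `Pretty` comes from the text above.

```julia
# Hypothetical names throughout; a sketch, not a prescribed design.
abstract type Shape end                   # abstract concept: noun

struct Circle <: Shape                    # concrete data structure: noun
    radius::Float64
end

struct Label                              # unrelated concrete type: noun
    text::String
end

# Traits describe behavior, so adjectives read naturally.
struct Pretty end
struct Plain end

style(::Type) = Plain()                   # default trait for any type
style(::Type{Circle}) = Pretty()          # Circle opts in to pretty output

caption(c::Circle) = "circle r=$(c.radius)"
caption(l::Label) = l.text

# Dispatch on the trait value rather than on the data type itself.
render(x) = render(style(typeof(x)), caption(x))
render(::Pretty, text) = "** $text **"
render(::Plain, text) = text
```

With these definitions, `render(Circle(1.5))` goes through the `Pretty` method while `render(Label("hi"))` falls back to `Plain`, even though the two types share no supertype.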
A general rule of thumb is to come up with names that are not too long. I would say any name composed of more than three words is a code smell. So, `ProductCatalog` is perfectly fine and even `ElectronicProductCatalog` is acceptable, but `ElectronicDepartmentProductCatalog` would be too long. Why is it bad? Long names are harder to read and spelling mistakes in them are hard to spot, causing headaches during debugging.
For functions, when should I use verbs? And when should I use nouns?
There is no hard-and-fast rule. Based upon my experience, I have developed an intuition for making these kinds of judgments. Let me illustrate my thought process with an example.
Let's say I am working on an e-commerce site and I have a `Product` data type. I want to define a function that determines the price of the product based upon the current market condition. I can think of two names: `current_price` and `calculate_price`.
Now, consider the code below:
for product in products
    if current_price(product) in price_range
        add(search_result, product)
    end
end
Does it look better or worse if I choose `calculate_price` instead? I would say worse. The reason why `current_price` looks better is that it's easier to read. I can read the code as "if current price of the product is in the price range, then..." When I use `current_price`, I'm emphasizing the thing that is being returned from the function rather than how it does its job.
At this point, one may argue that the issue can be resolved with a temporary variable:
for product in products
    price = calculate_price(product)
    if price in price_range
        add(search_result, product)
    end
end
That's true but it limits the way you write code, so I can't say it's optimal.
Next, let's consider the case where I need a new function that finds the best price of the product using several different pricing algorithms. So, the pricing function needs to take an algorithm argument. The function now reads:
"Return the current price of the product calculated using the specified algorithm."
current_price(product::Product, algorithm::Algorithm)
At this point, the code starts to feel just a little more awkward to read than before. I may lean towards using a verb instead:
"Calculate the price of the product using the specified algorithm."
calculate_price(product::Product, algorithm::Algorithm)
Using a verb makes it more natural when you need to emphasize the action being taken to get the job done. In this example, the use of `calculate` goes together with the argument `algorithm` quite well.
So, my point is: what do you want to emphasize when calling the function? Is it the thing that it returns, or the process that it goes through?
You should be able to make a better decision when you can answer the above question.
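
The two names do not have to compete. Below is a minimal sketch in which both coexist: the verb form takes the algorithm, and the noun form hides it behind a default. The fields of `Product`, the `ListPrice` and `Discounted` algorithms, and the 25% default discount are all assumptions invented for this example.

```julia
# Hypothetical types and pricing rules, for illustration only.
struct Product
    name::String
    list_price::Float64
end

abstract type Algorithm end
struct ListPrice <: Algorithm end
struct Discounted <: Algorithm
    rate::Float64                         # e.g. 0.25 means 25% off
end

# Verb: the caller cares about how the price is computed.
calculate_price(p::Product, ::ListPrice) = p.list_price
calculate_price(p::Product, a::Discounted) = p.list_price * (1 - a.rate)

# Noun: the caller only cares about the value.
# The default algorithm here is an assumption made for this sketch.
current_price(p::Product) = calculate_price(p, Discounted(0.25))

products = [Product("pen", 2.0), Product("lamp", 40.0)]

# Reads as "names of products whose current price is between 10 and 50".
affordable = [p.name for p in products if 10 <= current_price(p) <= 50]
```

Call sites that filter or compare prices use the noun, while code that experiments with pricing strategies uses the verb and passes the algorithm explicitly.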
I believe the best code should be readable. More often than not, I have heard other programming experts advocate using meaningful words for all variables in the code. Religiously. I agree with that most of the time, until the verbosity kicks in and kills readability.
I can think of two examples where it makes sense to use terse names.
The first case is code that performs heavy-lifting mathematical calculations. It is usually more natural to use math symbols than English words. Julia has full support for Unicode and this is an area where it shines. Here's an example from my BoxCoxTrans.jl package:
function transform(𝐱, λ; α = 0, scaled = false, kwargs...)
    if α != 0
        𝐱 .+= α
    end
    any(𝐱 .<= 0) && throw(DomainError("Data must be positive and ideally greater than 1. You may specify α argument(shift). "))
    if scaled
        gm = geomean(𝐱)
        @. λ ≈ 0 ? gm * log(𝐱) : (𝐱 ^ λ - 1) / (λ * gm ^ (λ - 1))
    else
        @. λ ≈ 0 ? log(𝐱) : (𝐱 ^ λ - 1) / λ
    end
end
In its current state, the code looks fairly clean and maybe even resembles the formula found in a math research paper. By contrast, it would look very ugly if I had to replace all the symbols with names like `alpha`, `lambda`, and `time_series`.
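
For comparison, here is a hypothetical spelled-out rewrite of just the unscaled branch. It is a simplified sketch and not code from BoxCoxTrans.jl: the name `transform_verbose` is made up, the `scaled` option is dropped, and the input is copied instead of shifted in place.

```julia
# Hypothetical, simplified rewrite using English names instead of symbols.
function transform_verbose(time_series, lambda; alpha = 0)
    shifted = time_series .+ alpha        # copy, unlike the in-place original
    any(shifted .<= 0) && throw(DomainError(shifted, "Data must be positive."))
    if lambda ≈ 0
        return log.(shifted)
    else
        return (shifted .^ lambda .- 1) ./ lambda
    end
end
```

The logic is the same, but the expression `(shifted .^ lambda .- 1) ./ lambda` no longer looks like the formula it implements.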
The second case relates to the argument that indexing variables should be given descriptive names such as `index`. I generally oppose this idea. For simple code, it does not matter. For something more complex, it works against you and hurts readability.
Consider this code that I found in the LinearAlgebra package:
## Swap rows i and j and columns i and j in X
function rcswap!(i::Integer, j::Integer, X::StridedMatrix{<:Number})
    for k = 1:size(X,1)
        X[k,i], X[k,j] = X[k,j], X[k,i]
    end
    for k = 1:size(X,2)
        X[i,k], X[j,k] = X[j,k], X[i,k]
    end
end
Let's do an exercise and see how it looks if we replace those short variable names with longer, more meaningful ones:
function rcswap!(index1::Integer, index2::Integer, matrix::StridedMatrix{<:Number})
    for position = 1:size(matrix,1)
        matrix[position,index1], matrix[position,index2] =
            matrix[position,index2], matrix[position,index1]
    end
    for position = 1:size(matrix,2)
        matrix[index1,position], matrix[index2,position] =
            matrix[index2,position], matrix[index1,position]
    end
end
If the purpose of using longer variable names is to improve readability, then this is doing more harm than good.
By contrast, using single-letter variable names is not too bad. In fact, single-letter names are commonly used for array indices in tutorials for many programming languages (e.g. Python, Java, JavaScript, and Rust).
In addition, whenever I read code, my brain already translates them to read like `index`. Really, there is no difference between `i`, `j`, `k` and `index`. They all look the same to me.
Is naming things actually difficult? Definitely, maybe.
You may think that I always name things properly? Wrong! I am definitely not perfect. Having written a book and a couple of blog posts, I have come to realize that writing code is no different from writing in general. You must keep revising (refactoring) to get it right.
I do believe that practice makes perfect. I would encourage you to ask someone else to read your code and give you feedback.
P.S. For more tips on writing good code in Julia, consider picking up my book Hands-on Design Patterns and Best Practices with Julia.