Category Archives: Programming

On the Importance of ‘Technical Debt’

A couple of years ago, I was talking with a good friend of mine about the difficulties of prioritizing the maintainability of software in a large development organization.

And so, logically, the concept of ‘Technical Debt’ came up. Interestingly, he had never heard the term before[1], although as soon as he heard it, he grasped the importance.

(I remember it as being a really inspiring conversation, but sadly, my notes from that day don’t capture what I found so inspiring about it. 🙁 )

Although the concepts of ‘clean up after yourself’ and ‘do it the right way’ are likely as old as human civilization, it was likely only after systems reached a certain level of complexity that the concept of ‘Technical Debt’ was really useful. There is a limit to how complex a mechanical system can get[2], and most other systems are amenable to human factors and psychological safety solutions.

It’s also interesting to think about what is different about software that makes it: A) possible to make a working product with large (including conceptual) defects, and B) useful to ‘go into debt’ to get a product out the door faster (or more cheaply).

One wonders how much it is the sheer complexity of software systems, the number of interacting modules, or perhaps the number of layers involved, from OS to dev_tools, to language, to standard libraries, to 3rd party libraries, to user-made helper functions. Perhaps it is just that one can ‘go into debt’ in the uppermost layer, because there exists a good foundation.

It could also simply be that software is an automation of so many smaller tasks, that any human system as complex would have similar useful concepts of debt[3].

Doing a little bit of digging, it seems that the concept was first termed ‘debt’ sometime in 1992[4], but it was not until later that it was termed ‘Technical Debt’.

Articulating the concept of ‘Technical Debt’ has a number of important benefits:

1) It puts a name on the category of ‘things we want to clean up in our code’, adds an urgency, and calls out more precisely why this is important.

2) It links the concept of ‘do things the right way’ with ‘Business’ concepts of money. This enables much better CTO-CFO conversations, allows better and more informed project funding decision making, and (hopefully) enables better and more structured funding for Technical Debt reduction[5].

3) It enables conversations in the moment, during architecture conversations and code reviews (and everything in between), where the parties involved can directly weigh/balance the time/resource costs of proper implementation with the opportunity costs of delaying time to market (or MVI/MVP[6]).

It will be interesting to see how organizations (and organizational decision-making) change as this concept spreads from ‘pure’ software companies.

[1] We theorized that this was because he had grown up in Hardware companies.

[2] I am not a Mechanical Engineer, and I’m happy to hear counterexamples, as well as the conceptual frameworks used to address this… 🙂

[3] Such as ‘Organizational Debt’.

[4] https://www.martinfowler.com/bliki/TechnicalDebt.html “As far as I can tell, Ward first introduced this concept in an experience report for OOPSLA 1992. It has also been discussed on the wiki http://wiki.c2.com/?ComplexityAsDebt.”

[5] My favourite label for this is the ‘FBI’ list[7], as in ‘Can you F****** Believe It?’, passed down to me by an executive from a famous Canadian software company.

[6] ‘Minimum Viable Increment/Minimum Viable Product’, from various implementations of Agile theory.

[7] Things that might linger on a list like this include things filed ‘Too Dangerous to Fix’, which are often interesting memoir fodder.

The Mysterious Case of the Regex Dot

So, I’m in the middle of organizing my photos into folders, something more useable than the default Photos application on Mac[1].

While trying to count the number of photos/videos[2] in each subdirectory in my …/2018/ folder:

$ time find * |grep IMG|grep -o '^[0-9][0-9]/.'|uniq -c
22 04/0
3297 05/1
104 05/2
100 06/0
1830 06/2
2040 10/2

I first tried the supposedly logical:

$ time find * |grep IMG|grep -o ^..|uniq -c|head
1 04
1 /0
1 1/
1 20
1 18
1 04
1 01
1 -0
1 00
1 41

Interestingly, grep (and/or the OS) seemed to be taking the front off of each line, and then putting it back into the STDIN hopper for the next call to grep.

As this was not doing what I expected (nor wanted), I tried:

$ time find * |grep IMG|grep -o '^[0-9][0-9]/'|uniq -c|head
1 04/
1 01/
1 04/
1 01/
1 04/
1 01/
1 04/
1 01/
1 04/
1 01/

Which, while better…

$ time find * |grep IMG|grep -o '^[0-9][0-9]/'|uniq -c|sort|uniq -c
22 1 01/
22 1 04/
3501 1 05/
1930 1 06/
2040 1 10/
3297 1 13/
104 1 23/
1830 1 24/
2040 1 27/

…gave me too many results by about a factor of two, and somehow found 27 months in the year.

I quickly figured out that while parsing mm/dd/yyyymmdd-hash/IMG_[0-9][0-9][0-9][0-9].[FILETYPE], this particular grep/OS combination will happily grab the ‘mm/’, and then also grab the ‘dd/’. This habit, while charming, does not solve my problem.

After google searching https://www.google.com/search?q=grep+one+match+per+line proved unfruitful, I decided to try:

$ time find * |grep IMG|grep -o '^[0-9][0-9]/.'|uniq -c
22 04/0
3297 05/1
104 05/2
100 06/0
1830 06/2
2040 10/2

and it worked!

I was stumped, until I figured out that the issues that I had been seeing before were entirely because grep was finding results at the start of the newly chomped string, and that by chomping part of the next ‘match’, I was stopping grep from finding any more matches.
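The ‘match, chomp, match again’ behaviour can be mimicked in a few lines of Python. This is only a sketch of what the output above suggests is going on, not how grep is actually implemented; the function name rescan_matches and the sample path are made up for illustration.

```python
import re

def rescan_matches(pattern, line):
    """Match at the start of the line, chop the match off, then treat the
    remainder as a brand-new line and try again (the behaviour seen above)."""
    compiled = re.compile(pattern)
    found = []
    rest = line
    while rest:
        m = compiled.match(rest)
        if m is None or m.end() == 0:
            break  # nothing (or nothing non-empty) matches at the new 'start'
        found.append(m.group(0))
        rest = rest[m.end():]
    return found

# Hypothetical path in the mm/dd/yyyymmdd-hash/IMG_nnnn.TYPE layout
path = "05/13/20180513-abcdef/IMG_0001.JPG"

print(rescan_matches(r"^[0-9][0-9]/", path))   # ['05/', '13/']
print(rescan_matches(r"^[0-9][0-9]/.", path))  # ['05/1']
```

With the trailing dot, the first match swallows the first digit of the day, so what is left (‘3/2018…’) no longer starts with two digits and the loop stops after one match per line.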

#themoreyouknow

[1] Right now, when Photos organizes photos, it puts each photo into its own folder, based on year/month/day/yyyymmdd-hash, which makes it super-annoying to use anything but the Photos app, which is itself super-slow and annoying to use.

[2] The images are all in the format ‘IMG_[0-9][0-9][0-9][0-9].[FILETYPE]’, where FILETYPE can be ‘PNG’ (screenshots), ‘JPG’ (camera pictures), ‘MOV’ (camera movies), ‘GIF’ (saved .gifs), or perhaps some other recognized image format.

If you wish to make a song from scratch, you must first Invent the Universe…

To write some music, you must first invent some instruments. To do this, one might start with a simple sine wave, then do modulations and superpositions to make various ‘instruments’.

To this end, I did a little bit of research (thanks, soledadpenades!), and put/cribbed together some Python code to make arbitrary .wav files:
# Written 2016-12-26, with special thanks to:
# https://soledadpenades.com/2009/10/29/fastest-way-to-generate-wav-files-in-python-using-the-wave-module/

import math
import struct
import wave

SAMPLING_RATE = 44100             # samples per second (44.1kHz)
WAV_FILE_LEN = SAMPLING_RATE * 1  # number of frames: 1 second of audio
MAX_AMP = 32767                   # largest signed 16-bit sample value
SIN_WAV_FREQ = 100                # sine wave frequency, in Hz

# The phase has to advance by 2*pi per cycle. Dividing by (pi/2) instead
# is off by a factor of pi^2 (about 9.87), which is where the mysterious
# 'Hz*10' correction factor in the first version of this script came from.

output_file = wave.open('test.wav', 'wb')
# 2 channels, 2 bytes per sample, uncompressed
output_file.setparams((2, 2, SAMPLING_RATE, 0, 'NONE', 'not compressed'))

for i in range(WAV_FILE_LEN):
    sample = int(MAX_AMP * math.sin(2 * math.pi * SIN_WAV_FREQ * i / SAMPLING_RATE))
    # One frame = the same little-endian 16-bit sample for left and right
    packed_data = struct.pack('<hh', sample, sample)
    output_file.writeframes(packed_data)

output_file.close()

For those who are curious, this generates a 100Hz sine wave.
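As a first stab at the ‘superpositions’ idea from the top of this post, here is a minimal sketch that adds a few harmonics on top of the fundamental. The function name additive_samples and the harmonic amplitudes are placeholders I picked for illustration, not a recipe for any real instrument.

```python
import math

def additive_samples(freq_hz, harmonic_amps, n_samples,
                     sampling_rate=44100, max_amp=32767):
    """Sum sine waves at freq_hz, 2*freq_hz, 3*freq_hz, ... with the given
    relative amplitudes, scaled so the result fits in a signed 16-bit sample."""
    total = sum(abs(a) for a in harmonic_amps) or 1.0
    samples = []
    for i in range(n_samples):
        t = i / sampling_rate
        value = sum(a * math.sin(2 * math.pi * freq_hz * (k + 1) * t)
                    for k, a in enumerate(harmonic_amps))
        samples.append(int(max_amp * value / total))
    return samples

# 100Hz fundamental plus two quieter overtones (amplitudes are arbitrary)
samples = additive_samples(100, [1.0, 0.5, 0.25], 441)
print(len(samples), min(samples), max(samples))
```

Each value can be packed with struct.pack and written out with the wave module in the same way as the plain sine wave.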

Next up, some experimentation with different pitches, perhaps different timbres. Stay tuned*!

*Also 100Hz.

Interview Questions: Types of Coding and Algorithm Questions

Part of a continuing series on Interviews and Interview Questions.

Today, we’re going to look at types of coding and algorithm questions. As discussed before, these can be divided up into ‘Problem Solving’ and ‘Knowledge’ questions.

As mentioned before, ‘Knowledge’ questions are very close to ‘human glossary’ questions. ‘What is the Big-O order of QuickSort? Average case? Worst case?’.
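For reference, here is a minimal sketch of the algorithm behind that particular glossary question. It is a simple, not-in-place variant chosen for readability, rather than the partition-in-place version a textbook would usually show.

```python
def quicksort(items):
    """Return a sorted copy of items.

    Average case: O(n log n). Worst case: O(n^2), which happens when every
    pivot turns out to be the smallest or largest remaining element."""
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)

print(quicksort([5, 3, 8, 1, 9, 2, 5]))  # [1, 2, 3, 5, 5, 8, 9]
```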

But there are some questions which straddle the line between knowledge and problem solving, answers that few but an expert in that topic would be able to exactly recall, like ‘what exactly happens between when you type google.com into your browser and the page appears?’, or ‘compare and contrast various sorting algorithms’.

For those questions, you have to be as widely read as possible; they tend to select for those who are naturally inquisitive about things outside their specific area of expertise.

Now, for coding questions. There seem to be a few different types, which I’ll try to separate out by data structure[1]:

Arrays and Strings – Any data structure where any element is addressable in O(1) time, where elements are allocated together in memory.

Linked Lists, Stacks, and Queues – Data structures in linear form, where elements far away from the origin are O(N) difficult to access.

Trees – Data structures arranged in a tree form, with a clear root and directionality. Often sorted.

Graphs – Data structures with nodes and edges, where the edges can connect the nodes in arbitrary ways. Home to at least the plurality of the known NP-Complete problems. Note that Graph problems are a superset of the above.

Search and Optimization – Problems where you’re trying to find an optimal (or close to optimal) solution in a large multidimensional tensor or vector field. Many in this category are easily mappable to Graph Theory questions, but many are not, such as 3-D Protein Structure Prediction. Most interviews would likely not ask questions in this category, at least not very complex ones.

Machine Learning and Statistics – Somewhat related to Search and Optimization, problems dealing with how one trains a computer to solve a problem. Likely to become more and more important.

Hashes – Data structures where space is traded for speed. Generally assumed to have O(1) insertion and retrieval.
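To make the access-time differences in the list above concrete, here is a small sketch contrasting an array, a linked list, and a hash. The Node class and linked_list_get helper are hypothetical names written just for this illustration.

```python
class Node:
    """One link in a singly linked list."""
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

def linked_list_get(head, index):
    """Walk the chain one node at a time: O(N) in the index."""
    node = head
    for _ in range(index):
        if node is None:
            break
        node = node.next
    if node is None:
        raise IndexError("index out of range")
    return node.value

values = [10, 20, 30, 40]

# Array-like: any element is addressable directly, O(1).
array = list(values)
print(array[3])  # 40

# Linked list: build 10 -> 20 -> 30 -> 40, then walk to reach index 3.
head = None
for v in reversed(values):
    head = Node(v, head)
print(linked_list_get(head, 3))  # 40

# Hash: extra space buys (expected) O(1) insertion and retrieval by key.
positions = {v: i for i, v in enumerate(values)}
print(positions[40])  # 3
```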

[1]Hat tip: developer.com

Interview Questions: Other

In previous posts, I’ve talked about the most important types of interview questions:

‘Behavioural’ questions ask ‘Describe a time when you encountered a problem like this’.

‘Situational’ questions ask ‘Given this situation, how would you solve it?’

‘Technical’ questions ask ‘Solve this defined problem for me.’

Today, I’ll cover some other types of questions that are known to not have much predictive power, but people still ask, either as an ice breaker, or because they have other reasons for asking these questions.

‘Ice Breaker’ questions ask ‘tell me a story about yourself, to help relax you.’

The purpose of ‘Ice Breaker’ questions is to get the conversational flow started. My personal favourite is ‘tell me about the project you’re most proud of’, because it will help to relax the candidate, and has the dual purpose of showing what a candidate is like when they’re excited about something.

‘Dumb’ questions ask things outside the normal boundaries of a standard interview.

From the link, examples might include “What kind of animal would you like to be?” or “What color best describes you?[1]” The ostensible purpose is to try to get beyond pre-programmed/rehearsed answers, looking for original thoughts. (I tend to prefer the ‘tell me what you’re most proud of’ type of question, as if you’re trying to knock a person off their rehearsed interview game, if they’re nervous, that might torpedo them, and you’re torpedoing them based on their interview skills, rather than actual skills. Better to choose a topic they know, and explore the limits of their thinking there.)

‘Illegal’ questions ask ‘I want to discriminate against you, in some illegal way’

Which questions are illegal will vary by jurisdiction, but generally include questions about things such as gender, age, marital status, religion, etc… Larger and governmental organizations tend to be better at not asking such questions, whether because of visibility or lawsuits. Knowing how to answer such questions can be tricky, because of the power differential between interviewer and interviewee, but especially because the organizations asking such questions may be hiring from a labour pool with few options.

‘Brainteaser’ or ‘Fermi’ questions ask ‘How many piano tuners are there in New York?’

These questions are the stereotypical ‘Google interview’ question, which is funny, because Google no longer asks this type of question[2]. I happen to enjoy this type of question, and they can be very useful for back-of-the-envelope estimation, but don’t really have a useful place in job interviews.

Next time, we’ll go more in depth about specific types of technical questions. Stay tuned!

[1]My favourite story on this topic comes from the brainstorming exercise: “List all the things you could do with this brick.” People would come up with some small number of ideas (like <10) for how to use the brick. Then the facilitator would say something like: "List me all the ways that your wackiest friend could use this brick." Interestingly, this generally elicits many more ideas, as it removes some of the social opprobrium of being 'weird'.

[2]cf. The British Empire no longer uses the ‘Imperial’ system.

Interview Questions: Technical

I’ve been writing about interview questions recently, most recently about ‘Behavioural’ and ‘Situational’ questions. If you recall:

‘Behavioural’ questions ask ‘Describe a time when you encountered a problem like this’.

‘Situational’ questions ask ‘Given this situation, how would you solve it?’

‘Technical’ questions ask ‘Solve this defined problem for me.’

Today, I want to talk about ‘Technical’ questions. This includes two types:

‘Problem Solving’ questions, where the interviewer asks a technical question, and expects you to go through some process to solve it, similar in some way to what one would do in a job in the field.

‘Knowledge’ questions, where the interviewer asks specific questions about your field of study or work. For a programming job, they might be about memory management or data structures, for HR, they might be about what is legal or accepted practice in the jurisdiction in question, etc…

(Note that these generally don’t include questions about a resume, which I would group under the ‘Behavioural’ umbrella, as the interviewee is expected to tell a story about them.)

So what is an interviewer looking for in these questions?

For both of these questions, the interviewer is looking for command of the subject matter and problem solving ability. There’s a whole smear of possible questions between these two extremes. (‘What is an array’ to ‘Design LinkedIn’.)

For basic knowledge questions, it would probably suffice to re-read a textbook, or read (and understand!) a glossary of the topics one would be interviewed in.

For ‘Problem Solving’ questions, answers are generally more involved.

Generally, the interviewee is given a problem statement:

“Write a program which counts from 1 to 100, and outputs ‘Fizz’ when the number is a multiple of 3 and ‘Buzz’ when the number is a multiple of 5.”

This problem statement may or may not be well defined, so it falls on the interviewee to ask questions until it is adequately defined:

“Does it also print the number when it is a multiple of 3 or 5?” “Is proper syntax required?” “What language?”

(This also makes sure that the interviewer and the interviewee are on the same page.)

I like to draw a large diagram, and/or write down my assumptions in the upper-left corner when doing problems like this. It makes things explicit, so people can see what you’re thinking.

One of my best bosses described his best programmer as ‘having a reason for every single line of code’. Talking through one’s code as it’s being written can help with this.

So:

Write down assumptions
Draw a big diagram
State the overall algorithm
Write down the solution, while talking about it
Think about corner cases, run an example through in your head.
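Applied to the Fizz/Buzz statement above, the steps might come out something like this sketch. The two assumptions in the comments are exactly the kind of thing to confirm with the interviewer first; the problem statement as given does not settle either of them.

```python
def fizzbuzz_line(n):
    # Assumption 1: multiples of both 3 and 5 output 'FizzBuzz'.
    # Assumption 2: all other numbers output the number itself.
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

def fizzbuzz(start=1, stop=100):
    """Overall algorithm: count from start to stop inclusive, one line each."""
    return [fizzbuzz_line(n) for n in range(start, stop + 1)]

# Corner cases worth running through by hand: 1, 3, 5, 15, 100.
for line in fizzbuzz():
    print(line)
```

Keeping the per-number rule in its own function makes the corner cases easy to check one at a time while talking through them.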

Next time, we’ll talk about some other types of questions, the kinds that are known to be not as predictive, but that interviewers still ask anyways, for various reasons. Stay tuned!

“Of All the Things I Miss, I Miss my Cache the Most.”

She was in the zone. It had taken two hours, half a RedBull (she would be paying for that later), and pissing off that guy who always seemed to want to talk longer than a conversation.

Free, flying through the code. There really was nothing like it.

She was working on a new DB caching layer for their server-side app. It was one of those ‘augmented reality’ games, but for corporate training. It still felt good to work on it though, to coerce the bits to bend to her will.

“Achievement Unlocked: You have met five new people in one day!”

Ugh. It sounded terrible, having to meet so many people all the time. To have to spend all that time and effort to convince them that the correct answer to a problem was, well, correct. If only they would just *see*.

But she had given up hope of being able to open peoples’ eyes. Give her code, or a nice juicy math problem, and she’d be content for hours, sometimes days.

*rumble* *rumble*

“Time to eat”, she thought to herself. She gets up, to go to the kitchen. Nyancore streams from her discarded headphones. As she turns, you can see the t-shirt she’s wearing says:

“Of all the things I miss, I miss my cache the most.”

She looks towards you and says “You can’t fault me for that.”

She laughs to herself and continues on her way.

“Senseless Juxtaposition of Wildcards.”

He had to admire the gall of the programmer who wrote the error messages.

“Senseless Juxtaposition of Wildcards.”

It might as well have said:

“Grow a brain!”

Or:

“Try listening to classical music.”

But then it got him thinking…

What would be a senseful juxtaposition of wildcards?

First, we would have to make a list of possible wildcards:

The ‘standard’ wildcard character, standing in for a single character, is the question mark, ‘?’. It generally stands in for any one of some set of things (or in Perl, 0 or 1 of a thing).

The ‘larger’ wildcard character, ‘*’, which stands for any number of something (including 0), sometimes expressed as ‘%’, if you’re speaking SQL.

The ‘even larger’ wildcard character, ‘…’, which is like a recursive ‘*’.

But could there be something larger still? Something which climbs the directory hierarchy in the opposite direction, perhaps? Something which can make it past all of the automatic filters, but is clearly wrong? Something like typing ‘NaN’[1] into a number field box? Something which steps outside the usual boundaries, like Thiotimoline?

In a biological context, there are entire alphabets of more-and-less-specific wildcards.

So, knowing all of this, what would be a senseful juxtaposition of wildcards? Something like ‘**’ would be meaninglessly equivalent to ‘*’, and ‘?*’ or ‘*?’ would be nearly so (they merely insist on at least one character).

You could attempt to mix SQL with bash-isms: “WHERE ID LIKE ‘%*’ “, showing that you expect an SQL character string followed by a bash character string, but that is again nonsensical.

Maybe it would have to be something like ‘hello??????’[2], to say that there are 6 characters of some type after your ‘hello’.
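A quick way to check these juxtapositions is Python’s fnmatch module, which follows shell-style wildcard rules. This is just a sketch with made-up test strings, and it suggests that ‘?*’ and ‘*?’ are only almost the same as ‘*’, since ‘?’ insists on exactly one character.

```python
from fnmatch import fnmatchcase

# '*' matches any run of characters, including none at all.
print(fnmatchcase("", "*"))    # True
print(fnmatchcase("", "**"))   # True

# '?' needs exactly one character, so '?*' and '*?' need at least one.
print(fnmatchcase("", "?*"))   # False
print(fnmatchcase("a", "?*"))  # True
print(fnmatchcase("a", "*?"))  # True

# Six '?'s: exactly six characters after 'hello'.
print(fnmatchcase("hello123456", "hello??????"))  # True
print(fnmatchcase("hello12345", "hello??????"))   # False
```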

But there it was. The senseful juxtaposition of wildcards… bash statements inside command-line SQL statements.

That was it! But he had to think. How would he use this?

[1]And like the link says, you really don’t want to confuse it with NaN3. You really don’t want to confuse *anything* with NaN3.

[2]Or ‘hello……’.

Adventures in Mobile Phone Resurrection

My day, in a nutshell:

[Background:]

0) See that your phone may be having issues. No time to spend the week fixing it. Tape[1] it up and take it to Burning Man. Wait 10 months for the issues to become serious. Go on vacation. Drop it in the airport on the way there. Nursemaid it through the vacation, trying to read through the horizontal lines of ‘VGA cable is partially detached’. Drop it on the plane on the way home.

0.5) Take it in to the store. They say they’ll replace the phone for $100, but the data won’t be transferred over. I buy a new phone and go home to check my backup situation. (All my photos and videos are fine, it’s my notes and TTD lists that I’m most concerned about.)

0.7) Get home and find out my last full backup sufficient for a ‘Restore’ is about 10 months old. For some reason the ‘Sync’ doesn’t actually sync any useful amount of data, and there are no useful ways to gain finer control of this (I’m assuming) without jailbreaking the device.

0.8) Restore the device using the old backup. It’s really odd to see your last messages with someone that are 10 months old. Resolve to transfer over the rest of the data somehow…

[Next Day]

1) Start backing up the old phone. Discover that your phone needs some number of tens of gigs to do a full backup, and your computer only has 11GB free.

2) Rummage through your hard drive, using suggestions from helpful sites, finding another 15GB.

3) Figure out that 25GB is again not enough. Start taking a closer look at that 40GB backup from 10 months ago. Look at the directory, noticing that it contains about 40k files, each named with a 40 character hex hash. Try uploading it to back it up. After about half an hour, calculate that it will take 15-20 hours. Try tar -czvf to reduce the number of files. This doesn’t significantly help the upload time.

4) Decide to bite the bullet and delete the old backup. As there are too many files in the directory for ‘rm *’ to work, start with ‘rm 1*’, through to ‘rm f*’[2].

5) Start the backup of the old phone again, it finishes, and I start the restore to the new phone.

6) After waiting for a while, the phone has reset, and it gives me the option to ‘restore from backup’. More waiting. ‘The backup was corrupt or not compatible with this device or device version.’

7) Try again. More waiting. Once again having the new phone factory reset to start the restore. Once again: ‘The backup was corrupt or not compatible with this device or device version.’

8) Look in the directory to see if there’s something obviously corrupt. Wait a second…Some of those files starting with ‘0’ are from 2015…

9) Move all of the old remaining files to a new folder. Try the restore again. ‘The backup was corrupt or not compatible with this device or device version.’[3] Realizing that I had missed other files (not starting with 0-f), or that some of the 0-starting files had made their way into the list of files for the new backup, I delete all of the files in the directory this time.

10) Full backup. Full restore. Full day.
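In hindsight, step 4 could have been done without leaning on shell wildcards at all. Here is a minimal sketch that deletes every file in a directory one at a time, so there is no giant argument list and no leading character to forget. The function name remove_all_files is made up, and the demo runs against a throwaway temporary directory rather than a real backup.

```python
import os
import tempfile

def remove_all_files(directory):
    """Delete every regular file directly inside `directory` and return
    how many were removed. Subdirectories are left alone."""
    # Collect the paths first, so nothing is deleted mid-listing.
    paths = [entry.path for entry in os.scandir(directory)
             if entry.is_file(follow_symlinks=False)]
    for path in paths:
        os.unlink(path)
    return len(paths)

# Demo on a temporary directory with a few hypothetical file names
with tempfile.TemporaryDirectory() as tmp:
    for name in ("0a1b", "1c2d", "f9e8", "not-hex.txt"):
        open(os.path.join(tmp, name), "w").close()
    removed = remove_all_files(tmp)
    leftover = os.listdir(tmp)

print(removed)   # 4
print(leftover)  # []
```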

[1]Duct tape, of course. My little friend was suffering from an ‘expanded battery’, which eventually became bad enough that the screen became separated from whatever was feeding the screen data. Two drops later, it was unfixable, thankfully still okay enough on the inside to back up.

[2]You may notice something here.

[3]Interestingly, (from behaviour, not from looking in the files), the backup program (iTunes) blindly adds new files to the list of hashed files, and probably adds them to a list somewhere in that directory. It apparently doesn’t do much checking of the backup until it tries to restore it somewhere.

Running A Sprint Planning Meeting

It’s the little things that sometimes make a difference. When I was teaching standardized test math so many years ago, I noticed as I was drawing problems on the board, all the little habits that I had picked up. Habits which make solving problems easier, habits which reduce the chance for error.

Things like the curve on the leg of the lower-case ‘t’, so that it doesn’t look like a ‘+’. Curving your ‘x’ so it doesn’t look like a ‘*’ sign.

I think some of this (probably sometimes annoying) attention to detail had carried over to Sprint Planning meetings[1].

Planning Poker is a method for a group to converge on a time estimate for a task or group of tasks. There are a number of ways to do this. The ‘canonical’ way we were taught to do this was to use Fibonacci-numbered cards (1,2,3,5,Eureka!). This involved a discussion of the task(s) to estimate until everyone had a reasonable idea of their complexity, then each person would choose a number estimate, all of which would be revealed simultaneously, to hopefully reduce bias. The discussion before estimation would not include estimates of how long things were estimated to take, to also try to reduce bias.
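The ‘everyone reveals at once’ step can be sketched as a tiny function. The deck values beyond 5, the participant names, and the reveal function itself are assumptions for illustration; what the team does after a split vote is left out on purpose.

```python
# Assumed deck: the cards above 5 are a guess at the usual Fibonacci values
FIBONACCI_CARDS = (1, 2, 3, 5, 8, 13)

def reveal(hidden_votes):
    """Reveal all votes at once. `hidden_votes` maps name -> chosen card,
    collected privately so nobody anchors on anyone else's number."""
    if not hidden_votes:
        raise ValueError("nobody voted")
    for name, card in hidden_votes.items():
        if card not in FIBONACCI_CARDS:
            raise ValueError(f"{name} played {card}, which is not a card")
    low = min(hidden_votes.values())
    high = max(hidden_votes.values())
    return {"consensus": low == high, "low": low, "high": high}

# Hypothetical round: the spread is the cue for more discussion
print(reveal({"ana": 2, "ben": 8, "chi": 3}))
# {'consensus': False, 'low': 2, 'high': 8}
```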

While we were running our planning meetings, I noticed that we would start to slip away from this ideal, perhaps because certain things were not important, perhaps because we didn’t see that certain things were important. For example:

We moved from cards to apps, and then to fingers. Using apps for estimation is less annoying than finding the cards each time, but fingers are even faster to find. I/we tried to get around the bias effect by having everyone display their fingers at once, and that worked reasonably well. Even making each person think about their estimate before display can help a lot with reducing the impact of what others might think of them.

One thing I tried which never really caught on when other people were running the meeting was saying ‘A,B,C’ instead of ‘1,2,3’, with the idea that it would be less biasing on the numbers people were choosing. (This may have mostly been an impression of mine, as the moving of the estimate from a mental number to a number of fingers may cement it in a slightly different mental state…)

If one is not careful, and perhaps somewhat impatient in meetings[2], one can start suggesting estimates before they are voted on. It can take considerable discipline and practice to not do this.

Another thing I noticed was how difficult JIRA was to use when one is not practiced in it, especially in a room with many people watching. Something that any experienced[3] demo-giver would know like the back of PowerPoint’s hand.

That’s all I have for now. For more minutiae, tune in tomorrow!

[1]For those of you who have not had the pleasure, these are the meetings at the start of an iteration, where the team sits down in a room, estimates a bunch of priority-ranked tasks, and decides (generally by consensus) how many of them they will commit to getting done in the next two weeks. Like all meetings, they can be good or bad, and the meeting chair (I feel) can make a large difference.

[2]I am probably as guilty of this as anyone. I would recommend Randy Pausch’s ‘Time Management’ for those who feel similarly.

[3]Read: ‘Battle-scarred’