Category Archives: Programming

What are your Non-Negotiables?

Recently, I was talking to someone[1] about my New Year’s resolutions, and we were discussing why I had kept one of them, but not the other two. It eventually came out that the resolution that worked (writing every day this year[2]) worked because I had made it a Non-Negotiable[3]. I had resolved that no matter what, every day this year, I would write something. Somehow, every day, I would carve out an hour or two, pushing other things aside so that I could focus and write.

(Incidentally, this practice of focusing has done wonders for me, helping me find ‘the zone’, or ‘flow’, much more consciously and easily.)

Sometimes I would push aside a computer game or Facebook, sometimes sleep, but those things didn’t matter compared to the commitment I had made (mostly to myself) to write every day.

The other Non-Negotiable that came to mind was the 5-minute standup. I was talking to someone about it today, and they started to say ‘5-10 minutes’, and I had to interject with talk of Non-Negotiables: if you let something like that slip, pretty soon you’re having daily half-hour sit-down ‘stand-up’ meetings.

Interestingly, biking to work every day is not quite a Non-Negotiable. I take probably a couple of weeks off each year, some for snow, some for rain, some for events. It’s pretty close, though, and I’m not too worried, because I’ve been doing it long enough (14 years, I think) that it’s a pretty deep-seated habit.

So, what are your Non-Negotiables? What is the one thing you want to change this year?

[1]Pretty sure it was G at a life coaching session, but my brain has this annoying tendency to abstract things away (that’s another post). I also remember it from a speech by the head counselor at music camp many years ago, but that’s another story…

[2]At least so far…

[3]This is a good précis on Non-Negotiables from a life coach.

Computers Win when the Rules are Fixed

One of the important reminders from game four of AlphaGo vs. Lee Sedol was the difference between what computers and humans are each best at.

Traditionally, computers were best at the most repetitive tasks: those that were well understood and could be completely described.

If you talk to any release or test engineer, they will tell you that once you can fully describe a process, it’s only a few more steps to be able to automate it.

What makes Machine Learning so tantalizing is that it’s been giving hints of being able to learn to perform not-fully-described tasks, most recently Go.

At the same time, Machine Learning still requires thousands or millions of examples in order to be able to ‘see’ things, whereas humans can understand and make inferences from far fewer examples. It’s unclear to me (and, I’m guessing, to most people) exactly why this is. It’s as if something about the way we learn things helps us learn other things.

But back to the topic at hand. What game four showed us (yet again) is that the better defined the problem, the better computers perform vs. humans.

A different example of this is how highly paid market research analysts are being replaced by automation that does in minutes what would take the analysts days.

So, how do you stay relevant as things become more and more automated and automatable?

As Lee Sedol showed, one strategy is to play Calvinball[1]. Find the part of your discipline that is the least defined, and pour yourself into pushing that boundary, leaving defined pieces in your wake[2].

Note: Playing Strategema like Data is another ‘fun’ option[3], but it is most useful when playing against a computer opponent, not so much for forging your own path. It consists of playing sub-optimal moves so as to confuse or anger the other player, to throw them off balance. It is postulated that Deep Blue did this to Kasparov.

[1]Calvinball is a mostly fictional game invented by Bill Watterson for Calvin and Hobbes. The game has only one rule, that it can never be played the same way twice.

[2]Technically, Lee Sedol played a very ‘loose’ game, one which was difficult to define, where parts of the board far away from each other were more easily related. You can also use this tactic to find the tasks where humans are still better than computers, and to do them in that way.

[3]We called this ‘victory through annoyance’ during undergrad. It had mixed reviews.

Go and Weaknesses of Decision Trees

Yesterday, we reported that an artificial Go player had defeated one of the top human players for the first time, in a best-of-five match.

Today, Lee Sedol responded with a ‘consolation win’, to make the score 3-1.

From this analysis of the game, it seems that (at least) two things were at play here (Hat tip PB).

The first is called ‘Manipulation’, which is a technique used to connect otherwise unrelated parts of the board. My understanding of it is that you make two (or more!) separate positions on the board, one which is bad unless you get an extra move, and the other which might allow you to get an extra move. Since the two locations are separate, the player has to have a very specific sense of non-locality to play it correctly[1].

To me, this feels like an excellent example of why Go is so difficult to solve computationally, and why there is still much fertile ground here for research.

The second seems to be an instance of what is called the ‘Horizon Effect’[2]. Simply put, if you only search a possible gameplay tree to a certain depth, any consequences below that depth will be invisible to you. So, if you have a move which seems good in the short term but has terrible consequences down the road, a depth-limited search might miss the negative consequences entirely. In this particular case, the supposition is that Sedol’s brilliant move 78 should have triggered a ‘crap, that was a brilliant move, I need to deal with that’, instead of a ‘now all the moves I was thinking of are bad moves, except for this subtree, which seems to be okay as far out as I can see’. The fact that AlphaGo only realized something was very wrong at move 87 supports this hypothesis.
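
To make the Horizon Effect concrete, here is a minimal sketch of a fixed-depth negamax search, the classic setting where the effect shows up. This is generic illustration code (the evaluate, legal_moves, and play callbacks are placeholders), not AlphaGo’s actual machinery, which combines Monte Carlo tree search with neural networks.

```python
# A minimal sketch of depth-limited negamax, to illustrate the horizon
# effect. evaluate(state) must score the position from the point of view
# of the player to move.

def negamax(state, depth, evaluate, legal_moves, play):
    """Search `depth` plies ahead; anything deeper is invisible."""
    moves = legal_moves(state)
    if depth == 0 or not moves:
        # The horizon: the position is scored as-is. A brilliant reply
        # sitting one ply below this cutoff contributes nothing, which
        # is exactly how a 'move 78' can be missed.
        return evaluate(state)
    return max(-negamax(play(state, m), depth - 1,
                        evaluate, legal_moves, play)
               for m in moves)
```

A quiescence search tries to push that horizon back by extending the search along ‘unstable’ lines instead of cutting everything off at a fixed depth.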

Is the Horizon Effect something you can just throw more machine learning at? Isn’t this what humans do?

[1]Specifically, the idea that two things can be related only by the fact that you can use resources from one to help the other.

[2]One wonders what type of ‘quiescence search’ AlphaGo was using that it missed this.

Beautiful AI and Go

Something monumental happened today. An artificial Go player defeated one of the top human players three times in a row, to win the best-of-five match.

But I want to go back to game two, where AlphaGo played an inhuman and ‘beautiful’ move 37, a ‘very strange move’.

This is what it must be like to have one of your children, or one of your students, surpass what you could ever do. You have given them all you can, and they take that and reform it into something beautiful.

They mentioned that AlphaGo plays the entire board at once, and so is better able to see unusual move possibilities like the one above. Fan Hui mentioned that he has improved (from being ranked 633rd to somewhere in the 300s) as he has played against AlphaGo.

What else can deep learning teach us? What other amazing, inconceivable things will we learn from this new child which is just beginning to flower?

Beenary

It was never really taken seriously. It was most often expressed as a joke:

Q: What type of logic do hive dwellers use?

A: Beenary logic!

And this was true, to an extent. Bees did in fact use beenary logic. But like their honeycombs, it was a hexary, or six-valued logic system. As part of the ‘hive mind[1]’, they would dance in one of six directions for each hat[2] of information conveyed.

Most bee historians had indeed converged on the conclusion that bees were the true inventors of hexary logic, and were the first to answer yes or no questions in one of six ways.

So it was for this reason that ‘beenary trees’ had six children for each node, that a ‘beenary search’ would involve a bee making a ‘bee line’ out from a central hex, and ‘beenary star systems’ were much more complex.

Also, in their preferred computer language, the conditional operator was ‘Bees?’.
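
(A strictly hypothetical sketch, in Python rather than in their preferred language, of what the bees’ data structures might look like:)

```python
import math

HEX = 6  # honeycombs, dance directions, children per node: always six

class BeenaryNode:
    """A node in a beenary tree: six children, one per comb wall."""
    def __init__(self, value):
        self.value = value
        self.children = [None] * HEX

def bees(condition, if_yes, if_no):
    """The conditional operator, 'Bees?'."""
    return if_yes if condition else if_no

# Each dance direction conveys one hat of information:
BITS_PER_HAT = math.log2(HEX)  # ~2.585 bits per hat
```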

cah-bees

[1]Scuttlebutt has it that the bees always hated the term ‘hive mind’, both because “Yeah, we live in a hive, and we have minds. What of it?”, and because it was mistakenly applied to other colony-forming insects.

[2]Binary uses ‘bits’ of information; the natural log uses ‘nats’. Ergo…

Agile: Limitations of Scrum

Based on some recent experiences and conversations, I wanted to explore a few limitations of Scrum.

1) Limitations of the Standup

Long-time readers will know that I’m a strong advocate of the 5-minute daily standup.

Where the 5-minute daily standup is weak is in any discussion of the larger picture. Try to have that discussion during a standup and it will take far more time than you have, and most of the group will have tuned out by the time you’re done. Under normal operation, the standup is great at firefighting issues with currently open tickets and seeing what is not being worked on, but unless you’re paying active attention, it’s easy to miss someone who is semi-blocked and not speaking up to say so. This happens often, usually out of pride or embarrassment, especially with new or inexperienced team members.

Proper use of one-on-ones is probably the best way to deal with this. But if your team member is shy or embarrassed about the issues they’re having (especially if they feel technically inadequate), you may need to do some serious digging and active listening to ferret this out. This will require your team member’s trust, but it will also help you gain it.

Another method is to have regular demos, perhaps daily or weekly, as part of your cadence, which will very quickly show who is having issues (or who does not know how to demo).

2) Limitations of having many small tasks

Organization and prioritization can be difficult and time-consuming. It can often be easier to delegate all of it to the Product Owner (after all, they have final say on prioritization). However, they may not be the expert on which items should be split into smaller ones, or they may try to determine too much of the ‘how’, instead of focusing on the ‘what’ and especially the ‘why’.

This can cause a disconnect between the ‘why’ and the rest of the team. The work can start to feel like an endless stream of little unimportant tasks, rather than a cohesive whole.

My tactic for dealing with this would be to hold a planning and prioritization session with everyone attending, so the entire team can be and feel involved, and understands the why of each item (and can influence which ones will and won’t be done!).

I was talking to a manager from another company earlier this week, and they had (I thought) a good idea for dealing with this type of issue. I would characterize it as somewhat orthogonal to Scrum, perhaps even to Agile. The concept is to give each member of the team a project of their own, where they are completely in charge. This way, they have complete ownership, and ostensibly will be much more engaged. I like this idea, and it helps a lot with the next point:

3) Scrum works best when team members are fungible

A lot of Agile, and Scrum especially, assumes that team members are reasonably fungible, sometimes almost to ‘Mythical Man-Month’ levels. If your team members have vastly different specializations (like Valve’s ‘T-shaped individuals’[1]), your prioritization may be greatly limited by how poorly the highest-priority items map to your team members’ strengths.

The tactic mentioned above, where you give each team member a small (or large!) project of their own, can help a lot here, but depending on how much of a disconnect there is between the composition of your team and what your organization requires, you may need to hire some people.

[1]If you have not read Valve’s Employee Handbook, go and read it now. I’m talking above about their description of employees who are ‘T-shaped’, which means that they have a broad set of skills, but are exceptionally deep in one area.

Seeing the Assembly Through the Cs

People always say that one of the important attributes of the C programming language is that you can look at the code and very clearly see the assembler that the compiler will produce.

But (I think) ever since I did assembler in class, no matter the language, I could see and map how the compiler would translate the code to assembler.

In fact, it bothered me to no end. When I was learning Perl, it was very difficult at first to learn to trust the interpreter to take care of large swathes of the task, to let go of control. I could see all of the horribly inefficient things it must be doing behind the scenes (‘what do you mean, dynamic typing?!?’), and for a long time, my Perl would read very much like my C[1].

But then I discovered regex[2]. And learned about premature optimization.

Now my bash commands read like my Perl.

[1]It probably still does, but it’s starting to bleed back the other way. I’m now reminded of the dangers of buffer overflows whenever I use scanf.

[2]If you are reading this, you likely will have seen Regex Golf, or the first (that I saw) Regex Crossword. (LOOK THEY MADE MORE. THIS IS AMAZING.)

Similar but not Identical

How do you make a playlist of songs which are similar, but not identical? Ideally, you want to play music that the user is likely to want to listen to*, but you probably don’t want to play the same song over and over, even in different remixes. So, how do you detect similarities while removing identicals, even when they may not be exactly identical?

In practice, there is probably a lot of separation between the spike of identical songs and those that are merely similar. You could also use the Web 2.0 crutch of looking at what people searched for after other songs, and/or the machine learning approach of putting songs after one another and seeing which suggestions people skipped or turned away from.
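
As a minimal sketch of exploiting that separation: assume each track has already been reduced to a fixed-length feature vector (a fingerprint or an embedding; producing it is the hard part), and keep only the candidates that fall in the ‘similar’ band. The threshold values here are invented for illustration and would need tuning against the identical-song spike in real data.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def similar_but_not_identical(seed_vec, candidates, lo=0.70, hi=0.97):
    """Keep tracks similar to the seed (above `lo`) but drop
    near-identicals (above `hi`), i.e. the same song in a different
    remix. Each candidate is a (track_id, feature_vector) pair; the
    thresholds are made up for illustration."""
    return [track for track, vec in candidates
            if lo < cosine_similarity(seed_vec, vec) < hi]
```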

Similarly**, cleaning data of artifacts is still an open problem, and it feels to me like a related one. You’re trying to remove a *huge* signal which is overwhelming your sensors so you can get at what you actually care about. Assuming both the artifacts and the signal are within your detection limit***, you have to determine the nature of the artifact: where it sits in the signal spectrum, which axes it spreads through, and how. It might also have related harmonics****.

Another related problem is the removal of 60Hz***** noise from all sorts of electronics. I’m not sure what sorts of filters are used, but even band-reject filters have non-ideal behaviour, so perhaps smoothing the edges in a known way works better; this is all speculation, though. I mostly like using the field around power cords to test oscilloscopes and to get people to think about electric fields.
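
For the mains-hum case specifically, a standard notch (band-reject) filter is only a few lines with scipy. A sketch, with an assumed 1kHz sample rate (set f0 to 50.0 across the pond):

```python
import numpy as np
from scipy.signal import iirnotch, filtfilt

fs = 1000.0  # sample rate in Hz (assumed for this sketch)
f0 = 60.0    # mains frequency to reject
Q = 30.0     # quality factor: higher means a narrower notch

# A signal we care about, swamped by a much larger 60Hz hum.
t = np.arange(0, 1.0, 1.0 / fs)
x = np.sin(2 * np.pi * 5.0 * t) + 10.0 * np.sin(2 * np.pi * f0 * t)

b, a = iirnotch(f0, Q, fs)  # design the band-reject filter
clean = filtfilt(b, a, x)   # zero-phase filtering avoids phase distortion
```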

But back to artifact removal. I don’t have particular insights right now, outside of specific problem spaces. I just think it would be a really cool problem to work on (and one that people work on in specific ways all the time).

*Or perhaps something just similar enough that you’ve been paid enough to play.

**But not identically.

***My favourite procedure/process is the one I learned from an analytical chemist, which is that the signal has to be 3x the noise for you to consider it signal.

****I’m using signal processing as an analogy, but the concept is the same for other artifact removal, just different math.

*****50Hz across the pond

Problem Solving Examples (With some Machine Learning)

So, in a previous post (http://nayrb.org/~blog/2015/12/25/automation-and-machine-learning/), we talked about some methods to help you decide whether you actually need Machine Learning to solve your problem. This post talks about several different problem-solving approaches and the types of problems they can make tractable.

I started my career fascinated by protein folding and protein design. By the time I got there, the field had narrowed the question down to one of search: ‘Given this physics-based scoring function, how do I find the optimal configuration of this molecule?’ They were using a number of different techniques: gradient descent, Monte Carlo, simulated annealing; but they all boiled down to seeking the optimal solution to an NP-Complete problem.
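
For reference, here is roughly what one of those search loops looks like: a generic simulated annealing sketch over an arbitrary scoring function. The score and neighbour callbacks are placeholders, not any particular folding package’s API.

```python
import math
import random

def simulated_annealing(initial, score, neighbour,
                        t0=1.0, cooling=0.999, steps=100_000):
    """Minimize `score` by proposing random `neighbour` moves, accepting
    worse ones with a probability that shrinks as the temperature cools.
    `score` and `neighbour` are problem-specific callbacks."""
    current, current_score = initial, score(initial)
    best, best_score = current, current_score
    t = t0
    for _ in range(steps):
        candidate = neighbour(current)
        cand_score = score(candidate)
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / t), which falls as t cools.
        if cand_score < current_score or \
           random.random() < math.exp((current_score - cand_score) / t):
            current, current_score = candidate, cand_score
            if current_score < best_score:
                best, best_score = current, current_score
        t *= cooling
    return best
```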

Since we know that biological systems can perform protein folding quickly, there must be some algorithm which can do it (even if that means simulating each individual electron). The question can then be restated as one of simulation or decision, from the perspective of the cell or of physics. Many other search problems have similarly easier human-like or physics-like solutions (ways of quickly finding answers that the NP-Complete verifier would accept). For example, a traveling salesperson would look at the map and quickly narrow the options down to a small number of candidate routes.
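
To make the salesperson example concrete, the simplest such narrowing is a greedy nearest-neighbour heuristic: it collapses a factorial number of possible routes into O(n²) work and one plausible candidate, much like eyeballing the map. A sketch:

```python
def nearest_neighbour_tour(cities, dist):
    """Greedy TSP heuristic: from each city, hop to the closest
    unvisited one. `dist` is any distance function over two cities.
    The result is rarely optimal, but it narrows n! possible routes
    down to one plausible candidate almost instantly."""
    unvisited = set(cities)
    tour = [unvisited.pop()]  # arbitrary starting city
    while unvisited:
        nearest = min(unvisited, key=lambda c: dist(tour[-1], c))
        unvisited.remove(nearest)
        tour.append(nearest)
    return tour
```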

In many ways, this kind of shortcut is the ‘holy grail’ of Machine Learning: the ability for a machine to step away from what we tell it and solve the problem in a more direct way. Heuristics are an attempt to solve this problem, but they’re always somewhat rules-based.

Next is clustering, best used for differentiating groups of things so that you can make a decision. My favourite example is flow cytometry (https://en.wikipedia.org/wiki/Flow_cytometry), where you’re trying to distinguish different groups of cells, basically by clustering on a 2-D graph of the brightness of various fluorescent cell markers.
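
A minimal sketch of that kind of two-population split, using scikit-learn’s KMeans on synthetic stand-in data (real cytometry gating is considerably more involved):

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for a cytometry scatter plot: two cell populations,
# each a 2-D point cloud of (marker A brightness, marker B brightness).
rng = np.random.default_rng(0)
dim_cells = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(500, 2))
bright_cells = rng.normal(loc=[6.0, 7.0], scale=0.7, size=(500, 2))
brightness = np.vstack([dim_cells, bright_cells])

# Cluster into two groups; the labels then drive the downstream decision.
labels = KMeans(n_clusters=2, n_init=10).fit_predict(brightness)
```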

Customer persona clustering is another example, such as you might do for segmentation when standard groupings like age or location are not good enough.

Machine Learning problems such as the Netflix challenge (http://www.netflixprize.com/), where you want a high degree of accuracy in your answer, require the use of a number of techniques. (The problem was to take a list of customer movie ratings and predict how those customers would rate other movies.)

First, you need to clean and normalize the data. The winning teams were also able to separate the general opinion of each movie from the specific opinion each person had about it. (Each of these was about as important as the other to the overall result.) Each of these normalizations or bias removals would likely have been done with some form of machine learning, suggesting that any comprehensive system would require multiple pipelines or channels, probably directed by some master channels* learning which of them were the most effective.
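
A minimal sketch of that bias-removal step, in the standard ‘baseline predictor’ framing (the general idea, not the winners’ actual code): strip out the global mean, each movie’s average offset (the ‘general opinion’), and each user’s average offset, leaving residuals for the fancier models to explain.

```python
from collections import defaultdict

def baseline_residuals(ratings):
    """ratings: list of (user, movie, stars) triples. Removes the global
    mean, per-movie bias (the 'general opinion of each movie'), and
    per-user bias, returning the per-rating residuals."""
    mu = sum(stars for _, _, stars in ratings) / len(ratings)

    # Per-movie bias: how far each movie's ratings sit from the mean.
    movie_offsets = defaultdict(list)
    for _, movie, stars in ratings:
        movie_offsets[movie].append(stars - mu)
    movie_bias = {m: sum(v) / len(v) for m, v in movie_offsets.items()}

    # Per-user bias: how far each user rates from mean + movie bias.
    user_offsets = defaultdict(list)
    for user, movie, stars in ratings:
        user_offsets[user].append(stars - mu - movie_bias[movie])
    user_bias = {u: sum(v) / len(v) for u, v in user_offsets.items()}

    return [(user, movie, stars - mu - movie_bias[movie] - user_bias[user])
            for user, movie, stars in ratings]
```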

I wonder how much of what we do as humans involves breaking the problem down, to divide and conquer. When we’re asked for a movie recommendation, do we think of good movies first, and then what that person would think of them? Personally, I feel I get my best results when I try to put myself in that person’s shoes, suggesting there may be a long way still to go.

Perhaps one could look at groups of movies, or some sort of tagging, to get at whatever ‘genes’ may be underneath, as you may like certain things about movies which are only imperfectly captured by how similarly people like them. (Or perhaps the data is big enough to capture all of this. It’s fun to speculate. 😀 )

*This suggests a hierarchy, which is only one way of seeing the structure. Other views are possible, but outside the scope of this post.