Category Archives: Machine Learning

Trolley Problem Memes

Trigger warning: Conversation and possibly dark humour about fictional (and possibly not-so-fictional) people dying in car and train accidents.

How do you design a self-driving car to appropriately value human life? Can you use a Facebook group to speed the development of philosophical discourse?

The ‘Trolley Problem’ is a problem in ethics, described in something close to its modern form as early as the 1950s and popularized by Philippa Foot in the 1960s. Basically, it boils down to the question:

If you have a choice between action and inaction, where both will cause harm, but your action will harm fewer people, is it moral to perform that action?

Interestingly, people answer this question differently depending on how direct the harmful action is, the ratio of people harmed between the choices of action and inaction, and other factors.

The astute will notice that this type of decision problem is a very common one: the most obvious case is military applications, but it also shows up in vaccination (and invasive health procedures in general), firebreaks, and, perhaps the canonical example, automobile design and manufacturing.

This type of decision making has become even more important with the advent of self-driving cars:

Would you drive a car that would choose to drive you into a brick wall rather than run over five pedestrians?

Overall, you would think that this would reduce your risk of fatality, but few people would choose that car, likely because it is a classic prisoner’s dilemma[1][2].

What is your self-driving car’s ethical system?

Personally, I think that much of this conversation is sophistry[3]. If one is truly interested in preserving life, the solution is not to convince self-driving cars to kill different people, but perhaps to require more stringent driver training, to invest in fixing known problem intersections, and to invest in better public transit.

So, if these conversations are not useful for anything else, they must be useful in and of themselves, and therefore must be Philosophy[4]!

One of the places that these conversations are occurring is the ‘Trolley Problem Memes’ Facebook page[5].

Now, you can argue that this page is purely for entertainment, but I think there’s a lot more hidden there. There is a ferment and interchange of ideas, happening faster and more fluidly than at any time in history. The person who writes the next book[6] on the ethics of decision making could well be influenced by, or be an avid user of, a site such as this one.

Image macros may have started with Rick-rolling, but now they are helping the advancement of human knowledge. Stew on that one for a while.

And while you’re thinking about that, something which ties it all together[7]:

"The creator might argue that his robot is an 'individual', capable of his own decisions, while the opposition would say that he (the creator) is responsible for the algorithm that led to the action. Imagine this happening - it would give birth to one of the greatest on-court debates ever." From Patrice Leiteritz via Trolley Problem Memes
“The creator might argue that his robot is an ‘individual’, capable of his own decisions, while the opposition would say that he (the creator) is responsible for the algorithm that led to the action. Imagine this happening – it would give birth to one of the greatest on-court debates ever.” From Patrice Leiteritz via Trolley Problem Memes

[1]If everyone cooperates, they all receive a better overall result; but any one of them who betrays the others gets an even better result for themselves, while making everyone else’s results much worse. In theory, this leads everyone to betray everyone else, leaving everyone with a worse overall outcome.
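
As a toy illustration of the logic in [1], here is a tiny Python sketch with made-up payoff numbers (nothing from the post itself), showing why each individual player is pulled towards betrayal even though mutual cooperation is better overall:

```python
# A toy illustration of footnote [1], with made-up payoff numbers.
# Any values where temptation > reward > punishment > sucker's payoff
# give the same result.

# PAYOFFS[(my_choice, their_choice)] = (my_payoff, their_payoff)
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),  # both do reasonably well
    ("cooperate", "defect"):    (0, 5),  # I get burned, they do best
    ("defect",    "cooperate"): (5, 0),  # I do best, they get burned
    ("defect",    "defect"):    (1, 1),  # everyone is worse off overall
}

def best_response(their_choice):
    """My payoff-maximizing choice, given what the other player does."""
    return max(("cooperate", "defect"),
               key=lambda mine: PAYOFFS[(mine, their_choice)][0])

# Whatever the other player does, defecting pays more for me...
assert best_response("cooperate") == "defect"
assert best_response("defect") == "defect"

# ...so both players defect and end up at (1, 1), worse than the (3, 3)
# they would have had by both cooperating.
```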

[2]People also like the feeling of control.

[3]Check out the article. Apparently, the Sophists were the first (recorded) right-wing think tanks.

[4]My undergrad Philosophy 101 prof. made the argument that because philosophy was not useful for anything else, it must be inherently useful (and that that was better).

[5]Dark humour. You have been warned.

[6]And it might not even be a book! A blog post, even! 😀

[7]Not a deliberate pun.

Interview Questions: Types of Coding and Algorithm Questions

Part of a continuing series on Interviews and Interview Questions.

Today, we’re going to look at types of coding and algorithm questions. As discussed before, these can be divided up into ‘Problem Solving’ and ‘Knowledge’ questions.

As mentioned before, ‘Knowledge’ questions are very close to ‘human glossary’ questions. ‘What is the Big-O order of QuickSort? Average case? Worst case?’.

But there are some questions which straddle the line between knowledge and problem solving, whose answers few but an expert in that topic would be able to recall exactly, like ‘what exactly happens between when you type google.com into your browser and the page appears?’, or ‘compare and contrast various sorting algorithms’.

For those questions, you have to be as widely read as possible; they tend to select for those who are naturally inquisitive about things outside their specific area of expertise.

Now, for coding questions. There seem to be a few different types, which I’ll try to separate out by data structure[1]:

Arrays and Strings – Any data structure where any element is addressable in O(1) time, where elements are allocated together in memory.

Linked Lists, Stacks, and Queues – Data structures in linear form, where elements far from the origin take O(N) time to access.

Trees – Data structures arranged in a tree form, with a clear root and directionality. Often sorted.

Graphs – Data structures with nodes and edges, where the edges can connect the nodes in arbitrary ways. Home to at least the plurality of the known NP-Complete problems. Note that Graph problems are a superset of the above.

Search and Optimization – Problems where you’re trying to find an optimal (or close to optimal) solution in a large multidimensional tensor or vector field. Many in this category are easily mappable to Graph Theory questions, but many are not, such as 3-D Protein Structure Prediction. Most interviews would likely not ask questions in this category, at least not very complex ones.

Machine Learning and Statistics – Somewhat related to Search and Optimization, problems dealing with how one trains a computer to solve a problem. Likely to become more and more important.

Hashes – Data structures where space is traded for speed. Generally assumed to have O(1) insertion and retrieval (see the sketch just below).
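
As a concrete example of that last space-for-speed trade, here is a minimal Python sketch of the classic ‘two-sum’ coding question; the function name and the sample input are just illustrative, not from any particular interview:

```python
def two_sum(numbers, target):
    """Return indices of two entries that sum to target, or None.

    A brute-force scan of all pairs is O(N^2) time and O(1) extra space.
    Trading space for speed, a hash table of values seen so far gives
    O(N) expected time at the cost of O(N) extra space.
    """
    seen = {}  # value -> index where we first saw it
    for i, value in enumerate(numbers):
        complement = target - value
        if complement in seen:           # O(1) expected lookup
            return seen[complement], i
        seen[value] = i                  # O(1) expected insertion
    return None

# Example: 2 + 7 == 9, so the indices of 2 and 7 are returned.
print(two_sum([2, 7, 11, 15], 9))  # (0, 1)
```

The same pattern (replacing a nested scan with a hash lookup) shows up in a lot of array and string questions as well.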

[1]Hat tip: developer.com

Stupid Hackathon Toronto Ideas

WARNING: SOME OF THE LINKS BELOW MAY LEAD TO NSFW OR TRIGGERING THINGS STOP YOU HAVE BEEN WARNED STOP

Some of you may be familiar with the ‘Stupid Hackathon‘, which I believe was started by Amelia Winger-Bearskin and Sam Lavigne at ITP in New York a few years ago.

(I also know of a San Francisco Stupid Hackathon, hosted by Noisebridge (of course).[1])

Setting aside the issues of privilege and the General Malaise required to make such an event work, I wanted to talk about a similar event happening in Toronto in late May:

http://stupidhacktoronto.com/

The categories (APOLOGIES FOR YELLING STOP THEY WERE ALSO YELLING ON THE SITE STOP):

MARGINALLY IMPROVED FOOD DELIVERY
– Is this the purchasing of food on margin? Speculating on food ‘Futures’? Or ‘Presents’?
– Is this finally the incarnation of AirHamAndCheese.com, the sharing economy startup[2] for fractional sandwich ownership? Only time will tell.
– Is this anything like ‘The Food Lift‘?

REDUCTIONIST BOLTZMANN MACHINES
– How many neurons are required for full reductionism?
– What happens when you only have one neuron? Does it talk to itself?
– If it can talk to itself in multiple ways, is that still Turing-complete?
– Do you get one of these by taking the PCA of your Restricted Boltzmann Machine and dropping the 90% least used neurons?

EMOJIANAL INTELLIGENCE
– I think I know what they mean here, and I’m not talking about this topic here.

QUICKTIME FOR PEGASI
– Is this about a phase-cloaking video display?
– Perhaps hacking a 6502-based console to run video?
– Perhaps a squadron flying horses in a hurry?
– Thinking about it, what would you need in video for a flying horse? Some type of HUD? Probably something very light.

MILLENIAL FALCONS
– I was looking at our new condo building, and what looked like a Red-tailed Hawk was perched on top. I hope we can become friends. They can live to 25 years old in the wild, so it might have been a millennial.
– How would you feed a stooping bird? Would you put food out on a flexible holder a few feet out halfway up a very tall building? Gotta practice that stoop somehow…

MAYBE PUT SOME SENSORS ON IT I GUESS CAN I HAVE MONEY NOW
– See ‘The Internet of Thins’.

VIRTUAL FEALTY
– Cue ‘Second Life’ references.
– You could talk about player organizations within MMORPGs, but what could you build to actually (not) help them?
– This topic is a pyramid scheme.

PENTACOPTERS
– For starfish, of course.
– Or this guy.

A FUCKING FITNESS TRACKER
– I feel like this would require a considerable amount of calibration for each user.
– Alternatively, this could be a hide-and-seek game.

THE INTERNET OF BEES
– See my post about ‘Beenary’ logic for some ideas on this.

[1]If you’ve never heard of Noisebridge, check out their website! All of the warnings at the top of this post probably apply.

[2]S suggests ‘Sandwich Rental’ for the ultimate experience.

The Bend of Biology and The Spin of Motors

Recently, we talked about how computers win when the rules are fixed, and how humans do better the more chaotic and flexible the rules are.

So, why is this? M mentioned that as humans, we have a ‘ridiculously powerful feature extraction system that is much more powerful and vastly parallel than any computer’. I’m sure some of this is because we spend years upon years training our brains to be able to recognize a dog from a blueberry muffin. But some of it is probably in the ‘design’.

What most people probably don’t know about computers is that the reason the chips can be so fast is the insulation between parts. It’s like how you add brakes to a car so that it can go faster. If you can insulate different parts of a chip from each other, and different parts of a computer from each other, using some kind of defined language to communicate between them, you can spend all your time independently making each part faster and more efficient. Over (not even that much) time, your computers will get (much) faster and faster. So much faster that they start to overwhelm other designs.

This is similar to how my hard drive (10s of MB/s) is now faster than the CPU on our old 286 (10MHz)[1].

Recent generations of CPUs are designed to be multi-layered; they might have a single-digit number of layers of ‘wires’ and ‘transistors’[2], and each of these layers is specifically designed to reduce cross-talk, to be as insulated from the others as it can be.

Contrast this with the brain which, while only running at about 1 kHz (vs. the multiple GHz of CPUs), has massively interconnected neurons, with connections running in all directions, connecting to each other in all kinds of non-binary ways. More complex, not insulated at all[3], chaotic, wonderful, and delightful.

Note: The title refers to how biology is very good at making limbs which bend back and forth, while machines are good at spinning motors.

[1]Yes, I know it’s not an exact comparison, but it’s fun to do anyway.

[2]This is correct enough for this conversation.

[3]Synesthesia is my canonical example.

Computers Win when the Rules are Fixed

One of the important reminders from game four of Alpha Go vs. Lee Sedol was the difference between what computers and humans are each best at.

Traditionally, computers were best at the most repetitive tasks: those that were well understood and could be completely described.

If you talk to any release or test engineer, they will tell you that once you can fully describe a process, it’s only a few more steps to be able to automate it.

What makes Machine Learning so tantalizing is that it’s been giving hints of being able to learn to perform not-fully-described tasks, most recently Go.

At the same time, Machine Learning still requires thousands or millions of examples in order to be able to ‘see’ things, whereas humans can understand and make inferences with many fewer examples. It’s unclear to me (and I’m guessing most people) exactly why this is. It’s like there’s something about the way we learn things which helps us learn other things.

But back to the topic at hand. What game four showed us (yet again) is that the better defined the problem, the better computers perform vs. humans.

A different example of this is how highly paid market research analysts are being replaced by automation, doing in minutes what would take the analysts days.

So, how do you stay relevant as things become more and more automated and automatable?

As Lee Sedol showed, one strategy is to play Calvinball[1]. Find the part of your discipline that is the least defined, and pour yourself into pushing that boundary, leaving defined pieces in your wake[2].

Note: Playing Strategema like Data is another ‘fun’ option[3], but it is most useful when playing against a computer opponent, not so much for forging your own path. It consists of playing sub-optimal moves so as to confuse or anger the other player, to throw them off balance. It is postulated that Deep Blue did this to Kasparov.

[1]Calvinball is a mostly fictional game invented by Bill Watterson for Calvin and Hobbes. The game has only one rule, that it can never be played the same way twice.

[2]Technically, Lee Sedol played a very ‘loose’ game, which was difficult to define, where parts of the board far away from each other were more easily related. You can also use this tactic to find things to do, and ways of doing them, where humans are better than computers.

[3]We called this ‘victory through annoyance’ during undergrad. It had mixed reviews.

Go and Weaknesses of Decision Trees

Yesterday, we reported that an artificial Go player had defeated one of the top human players for the first time, in a best of five match.

Today, Lee Sedol responded with a ‘consolation win’, to make the score 3-1.

From this analysis of the game, it seems that (at least) two things were at play here (Hat tip PB).

The first is called ‘Manipulation’, which is a technique used to connect otherwise unrelated parts of the board. My understanding of it is that you make two (or more!) separate positions on the board, one which is bad unless you get an extra move, and the other which might allow you to get an extra move. Since the two locations are separate, the player has to have a very specific sense of non-locality in order to be able to play it correctly[1].

To me, this feels like an excellent example of why Go is so difficult to solve computationally, and why there is still much fertile ground here for research.

The second seems to be an instance of what is called the ‘Horizon Effect’[2]. Simply put, if you only search a possible gameplay tree to a certain depth, any consequences below that depth will be invisible to you. So, if you have a move which seems to be good in the short term, but has terrible consequences down the road, a typical search tree might miss the negative consequences entirely. In this particular case, the supposition is that Sedol’s brilliant move 78 should have triggered a ‘crap, that was a brilliant move, I need to deal with that’, instead of a ‘now all the moves I was thinking of are bad moves, except for this subtree, which seems to be okay as far out as I can see’. The fact that at move 87 AlphaGo finally realized something was very wrong supports this hypothesis.
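
To make the Horizon Effect concrete, here is a minimal Python sketch of a depth-limited minimax search on a made-up toy game tree (nothing to do with AlphaGo’s actual search, which combined neural networks with Monte Carlo tree search): the search picks a move that looks great at its horizon and misses the disaster one ply beyond it.

```python
# A made-up toy game tree, not a real Go position. Each position is
# (static_score, {move: child}); a leaf is just a number. Static scores
# are what a shallow evaluation "sees" when it hits its depth limit.
TREE = (0, {
    # Move "a" looks great at the horizon (+10), but the opponent has a
    # crushing reply (-100) one ply deeper.
    "a": (10, {"x": -100, "y": 12}),
    # Move "b" looks modest (+3) and stays modest.
    "b": (3, {"x": 3, "y": 4}),
})

def evaluate(node):
    """Static evaluation: a leaf's value, or an internal node's rough score."""
    return node if isinstance(node, (int, float)) else node[0]

def minimax(node, depth, maximizing):
    """Plain depth-limited minimax; anything deeper than `depth` is invisible."""
    if isinstance(node, (int, float)) or depth == 0:
        return evaluate(node)
    values = [minimax(child, depth - 1, not maximizing)
              for child in node[1].values()]
    return max(values) if maximizing else min(values)

def best_move(tree, depth):
    moves = tree[1]
    return max(moves, key=lambda m: minimax(moves[m], depth - 1, False))

print(best_move(TREE, depth=1))  # 'a': at this horizon, +10 beats +3
print(best_move(TREE, depth=2))  # 'b': one ply deeper, 'a' is revealed as -100
```

Quiescence search is essentially a patch for exactly this problem: keep extending the search past the nominal depth limit while the position still looks unstable.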

Is the Horizon effect something you can just throw more machine learning at? Isn’t this what humans do?

[1]Specifically, the idea that two things can be related only by the fact that you can use resources from one to help the other.

[2]One wonders what types of ‘Quiescence Search’ AlphaGo was using that it missed this.

Beautiful AI and Go

Something monumental happened today. An artificial Go player defeated one of the top human players three times in a row, to win the best of five match.

But I want to go back to game two, where Alpha Go played an inhuman and ‘beautiful’ move 37, a ‘very strange move’.

This is what it must be like to have one of your children, or one of your students surpass what you could ever do. You have given them all you can, and they take that and reform it into something beautiful.

They mentioned that Alpha Go plays the entire board at once, and so is more able to see unusual move possibilities like the one above. Fan Hui mentioned that he’s improved (from a ranking of 633 to somewhere in the 300s) as he plays against Alpha Go.

What else can deep learning teach us? What other amazing, inconceivable things will we learn from this new child which is just beginning to flower?

Similar but not Identical

How do you make a playlist of songs which are similar, but not identical? Ideally, you want to play music that the user is likely to want to listen to*, but you probably don’t want to play the same song over and over, even in different remixes. So, how do you detect similarities, while removing identicals, even when they may not be so identical?

In practice, there is probably a lot of separation between the spike of identical songs and those that are merely similar. You could also use the Web 2.0 crutch of looking at what people searched for after other songs, and/or the machine learning approach of trying to put songs after one another and seeing what people skipped or turned to from the suggestions instead.
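
As a minimal sketch of that ‘lot of separation’ idea (all of the feature vectors, names, and thresholds below are made up for illustration; a real system would use audio fingerprints or learned embeddings), you can use two similarity thresholds: a very high one for ‘probably the same song’ and a lower one for ‘similar enough to enqueue’:

```python
import numpy as np

# Hypothetical audio feature vectors for a few tracks (made-up numbers);
# in practice these might come from a fingerprint or a learned embedding.
tracks = {
    "song_a":         np.array([0.90, 0.10, 0.40]),
    "song_a_remix":   np.array([0.91, 0.11, 0.39]),  # nearly identical to song_a
    "similar_song":   np.array([0.80, 0.25, 0.45]),  # similar, but not the same
    "different_song": np.array([0.05, 0.95, 0.10]),
}

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two thresholds, exploiting the gap between "identical" and "merely similar".
DUPLICATE_THRESHOLD = 0.999  # above this, treat it as the same song
SIMILAR_THRESHOLD = 0.95     # above this, it is a good playlist candidate

def playlist_candidates(seed, library):
    seed_vec = library[seed]
    for name, vec in library.items():
        if name == seed:
            continue
        similarity = cosine_similarity(seed_vec, vec)
        if similarity >= DUPLICATE_THRESHOLD:
            continue                 # skip remixes / re-releases of the seed
        if similarity >= SIMILAR_THRESHOLD:
            yield name, similarity   # similar, but not identical

print(list(playlist_candidates("song_a", tracks)))  # only 'similar_song' survives
```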

Similarly**, cleaning data of artifacts is still an open problem, and it feels to me like a related one. You’re trying to remove this *huge* signal which is overwhelming your sensors so you can get at what you actually care about. Assuming both the artifacts and the signal are within your detection limit***, you have to determine the nature of the artifact: both where it is in the signal spectrum, and what axes it spreads through and how. It might also have related harmonics****.

Another related problem is the removal of 60Hz***** noise from all sorts of electronics. I’m not sure what sorts of filters are used, but even band-reject filters have non-ideal behaviour, so perhaps smoothing the edges in a known way works better; this is all speculation, though. I mostly like using the field around power cords to test oscilloscopes and to get people to think about electric fields.
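
For what it’s worth, one common approach is a narrow IIR notch (band-reject) filter centred on the mains frequency. Here is a minimal sketch using scipy; the sample rate, Q factor, and test signal are made-up illustration values, not from any particular instrument:

```python
import numpy as np
from scipy.signal import iirnotch, filtfilt

fs = 1000.0        # sample rate in Hz (an illustrative value)
f_mains = 60.0     # mains frequency to reject (50 Hz across the pond)
quality = 30.0     # higher Q means a narrower notch, less damage to neighbours

# Design a second-order notch (band-reject) filter centred on 60 Hz.
b, a = iirnotch(f_mains, quality, fs)

# A made-up "measurement": a slow 5 Hz signal of interest buried in mains hum.
t = np.arange(0, 2.0, 1.0 / fs)
signal_of_interest = np.sin(2 * np.pi * 5.0 * t)
measured = signal_of_interest + 3.0 * np.sin(2 * np.pi * f_mains * t)

# filtfilt runs the filter forwards and backwards, cancelling the phase shift.
cleaned = filtfilt(b, a, measured)

# The hum should be strongly attenuated while the 5 Hz signal passes mostly
# untouched; the residual error should be small compared to the original hum.
print("residual error power:", np.mean((cleaned - signal_of_interest) ** 2))
```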

But back to artifact removal. I don’t have particular insights right now, outside of specific problem spaces. I just think it would be a really cool problem to work on (and one that people work on in a specific way all the time).

*Or perhaps something just similar enough that you’ve been paid enough to play.

**But not identically.

***My favourite procedure/process is the one I learned from an analytical chemist, which is that the signal has to be 3x the noise for you to consider it signal.

****I’m using signal processing as an analogy, but the concept is the same for other artifact removal, just different math.

*****50Hz across the pond