Category Archives: Science!

How Deep do you Present?

When you are giving a presentation, there are a number of decisions you have to make. How many words to put on each slide[1], what colour to make the slides[2], what you’re going to talk about[3], and many others.

Today, I want to focus on how you plan your presentation so as best to deal with ‘why did you?’ type questions. This is most helpful when you’re giving academic presentations, where you will likely have multiple people in the audience who actually know more[4] than you do about parts of what you’re talking about.

When you’re planning a presentation, it’s often tempting while you’re doing a survey of the field to go into an equal amount of depth all across the field, no matter how much you actually know about the field. This may be slightly better for the audience, but it means that in some parts of your presentation, you will not be able to answer even one ‘why?’ question[5].

It is better to decide how many ‘why’ questions you want to be able to answer, and then design your presentation so that there is always that much space between what you are presenting and the limit of your knowledge. You will be better able to serve your audience by being able to answer a reasonable depth of question, and you’re much less likely to embarrass yourself.

[1]None, if possible.
[2]Whatever helps keep the audience awake. I tend to use black on white for this reason.
[3]I recommend Keybeards and Bagpopes.
[4]Not to be confused with people who have the delightful combination of liking to hear themselves speak and the urge to tear others down while not really knowing much about the topic at hand. Sometimes this is a fine line.
[5]Cf. ‘Five Whys‘.

The Bend of Biology and The Spin of Motors

Recently, we talked about how computers win when the rules are fixed, and how humans are better, the more chaotic and flexible the rules are.

So, why is this? M mentioned that as humans, we have a ‘ridiculously powerful feature extraction system that is much more powerful and vastly parallel than any computer’. I’m sure some of this is because we spend years upon years training our brains to be able to recognize a dog from a blueberry muffin. But some of it is probably in the ‘design’.

What most people probably don’t know about computers is that the reason the chips can be so fast is the insulation between parts. It’s like how you add brakes to a car so that it can go faster. If you can insulate different parts of a chip from each other, and different parts of a computer from each other, using some kind of defined language to communicate between them, you can spend all your time independently making each part faster and more efficient. Over (not even that much) time, your computers will get (much) faster and faster. So much faster that they start to overwhelm other designs.

This is similar to how my hard drive (10s of MB/s) is now faster than the CPU on our old 286 (10MHz)[1].

Recent generations of CPUs are designed to be multi-layered. They might have some single-digit number of layers of ‘wires’ and ‘transistors’[2], and each of these layers is specifically designed to reduce cross-talk, to be as insulated as it can be from the others.

Contrast this with the brain, which while only running at about 1kHz (vs multiple GHz of CPUs), has massively interconnected neurons, with connections running in all directions, connecting to each other in all kinds of non-binary ways. More complex, not insulated at all[3], chaotic, wonderful, and delightful.

Note: The title refers to how biology is very good at making limbs which bend back and forth, while machines are good at spinning motors.

[1]Yes, I know it’s not an exact comparison, but it’s fun to do anyway.

[2]This is correct enough for this conversation.

[3]Synesthesia is my canonical example.

Shoddy Preprints vs. Agile Biology Development

Early Access to Raw Scientific Results or Shoddy Preprints? Agile Biology Development or Reckless Endangerment?

Today, I read a post that L made on fb about the issue of preprints in various bio-related fields. The worry is that people will preprint shoddy work online to get priority[0], followed by revision or ‘correction’ for publication.

If you’ve been reading this blog (or just the ‘Agile’ category) for a while, you’ll probably know that I am generally in favour of agile as well as Agile practices. My view is that the more communication and more frequent communication (up to a point)[1] you have between participants (in this case the scientific community), the more useful and better aligned the overall product will be with whatever the goal might or should be[2]. This means people can build on each other’s work more easily and quickly.

With code, it’s pretty easy to build on something someone else has done. A well-written set of unit tests will make sure that goes mostly smoothly. But how do you do this with research without the peer-review?

You can think of peer-review as the testing and release process for a minimum viable product of research, most commonly released as a scientific paper. But papers can take months to write, to go through review, to be published.

So, you have a huge body of researchers working on similar things, but only sharing notes every year or so[3].

So, you could have them send their raw results (untested code) around to each other as soon as they’ve acquired the data[4]. Currently, this is done in small groups of friends or collaborators, if that. What if they posted their raw results, and anyone in the world could download and comment[5]? As things became more refined, or others added their agreeing or contradictory results, the community as a whole could very quickly zero in on what was actually going on.

You would also have all the documentation you needed to show who had priority, and all of whom had contributed along the way. We would probably need to rethink a bit how we gave credit, as the above method could easily replace a lot of scientific publishing.

We would also have to rethink how we gave credit for careful work, as the above system would tend to reward quick work over careful work. But social media can probably show us the way here, with different researchers having some type of time-delayed ratings for how often their results are ‘accurate enough’.

Science may progress faster, and it would be difficult to grind up more grad. students than it does right now. Being part of a huge community who cared might help grad. students (and post-docs) a lot more than you might think.

I wanted to close with an example you’ve probably heard of which may help illustrate how this might work:

You’re probably familiar with Watson and Crick, and their work uncovering the Double Helix of DNA. You may not know that the X-ray structure photo which confirmed the theory that DNA was a double helix was made by Raymond Gosling, under the supervision of Rosalind Franklin.

What happened was Gosling returned to his former supervisor, Maurice Wilkins, who showed the photo to Watson and Crick without Franklin’s knowledge or consent. They proceeded to publish their famous ‘double helix’ paper with a footnote acknowledging “having been stimulated by a general knowledge of” Franklin and Wilkins’ “unpublished” contribution[6], followed by Wilkins’ and Franklin’s papers[7].

Note also that all three of these papers appeared with no peer review, and Wilkins’ boss went to the same gentleman’s club as one of the editors of Nature.

So, if we’d had instant world-sharing of preliminary results, Gosling would have posted his photos. Most people would not recognize the significance. Pauling and Corey, Watson and Crick would have all jumped on it. Franklin might have been persuaded to comment on what she thought before she was 100% sure. Wilkins might have come out of his shell sooner[8].

Science would have been done faster. More credit would have gone to the people who did the work. More credit would have been spread around to the people thinking about all of this. More of the conversation would be out in the open.

Science would have been done faster. Science might have been done better.

[0]“Who gets credit?” So important in a ‘publish or perish’ culture, but also important for the history books. The example below (above?) may elucidate some of these issues.

[1]I think most people top out at about once per day, but on a well-functioning team, on some types of tasks, this can be every few minutes, or seconds.

[2]Yes, there are arguments here about how some researchers should be left alone to do their work, because they’re working on things everyone else thinks are silly or wrong. They are outside the scope, and I don’t see them being as affected by preprints, which are much more likely to be an issue in extremely competitive fields. I suspect most researchers, like most writers and musicians, probably like most people, would be happy to have other people paying attention and caring about what they do.

[3]I use 1 year because it’s a nice round number, and because about 41% of scientific papers have an author on them who publishes a paper once a year or more.

[4]Or first draft… This will likely take some back and forth to discover the best use of people’s time.

[5]Note that this is basically what the genome sequencing centers do, and that project seems to be going reasonably well.

[6]The linked text is a direct quote from the Wikipedia article, which has two levels of quoting inside.

[7]Franklin’s paper was only included after she petitioned for its inclusion.

[8]The backstory on this is fascinating. The linked articles are probably a good start, but I’m guessing many books have been written on this. Teasing apart what actually happened 60 years later is nontrivial.

Beenary

It was never really taken seriously. It was most often expressed as a joke:

Q: What type of logic do hive dwellers use?

A: Beenary logic!

And this was true, to an extent. Bees did in fact use beenary logic. But like their honeycombs, it was a hexary, or six-valued logic system. As part of the ‘hive mind[1]’, they would dance in one of six directions for each hat[2] of information conveyed.

Most bee historians had indeed converged on the conclusion that bees were the true inventors of hexary logic, and were the first to answer yes or no questions in one of six ways.

So it was for this reason that ‘beenary trees’ had six children for each node, that a ‘beenary search’ would involve a bee making a ‘bee line’ out from a central hex, and ‘beenary star systems’ were much more complex.

Also, in their preferred computer language, the conditional operator was ‘Bees?’.

cah-bees

[1]Scuttlebutt has it that the bees always hated the term ‘hive mind’, both because “Yeah, we live in a hive, and we have minds. What of it?”, and because it was mistakenly applied to other colony forming insects.

[2]Binary uses ‘bits’ of information, the natural log uses ‘nats’ of information. Ergo…

Friendly Triangles and Spectator Ions

There are many different ways that you learn things. You can learn things from school, from books, from videos, from sticking a fork in a light socket.

But we’re talking about the things you learn in passing, or by osmosis, as you’re growing up. Sometimes these are things learned so early on in your education, so basic, and built upon by thousands of other concepts. Sometimes they are the ways of speaking of your parents, their ways of thinking.

For me, this was Spectator Ions. Growing up, my dad would always talk about (aqueous) chemical reactions, for example, from Wikipedia:

2Na⁺(aq) + CO₃²⁻(aq) + Cu²⁺(aq) + SO₄²⁻(aq) → 2Na⁺(aq) + SO₄²⁻(aq) + CuCO₃(s)

In this reaction, the carbonate anion is reacting/bonding with the copper cation. The two sodium cations and the sulfate anion have no part in this reaction. They are merely ‘spectators’.

So this is all reasonable, this makes sense. But I was trying to explain this to someone recently, and I realized that I didn’t know the phrase ‘spectator ions’, I just knew intuitively that sodium cations are basically never involved in reactions. The best way I can describe it is knowing them as ‘small and bouncy’. (Perhaps ‘small, bouncy, and indivisible’, unlike N2(g), which is ‘small to medium-sized, bouncy, and divisible with significant effort’.)

So, how do you explain something like this, when you approach it in such an intuitive way? I feel like it approaches or becomes an issue of privilege, like being the only person who can access the underpinnings of the system.

Sometimes, I feel the same way about ‘friendly triangles’. Probably the most famous of these is the ‘3,4,5’ triangle, which has been known (and presumably used in construction) since antiquity.

The other triangles commonly called ‘friendly’ are:
– 1,1,sqrt(2), or the ‘45,45,90’ triangle, used with unit vectors everywhere, also interestingly the right-angle triangle which has the smallest percentage of its perimeter in its hypotenuse.
– 1,sqrt(3),2, or the ‘30,60,90’ triangle, used most often probably with equilateral triangles and subsections thereof
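If you would rather check these triangles than take them on faith, here is a minimal sketch. The function names are my own for illustration. It confirms each set of sides satisfies Pythagoras, and prints what fraction of the perimeter the hypotenuse takes up at a few acute angles, so the perimeter comparison can be checked directly.

```python
import math

def is_right_triangle(a, b, c):
    """True if the two shorter sides squared sum to the longest side squared."""
    a, b, c = sorted((a, b, c))
    return math.isclose(a * a + b * b, c * c)

def hypotenuse_share(angle_deg):
    """Fraction of the perimeter taken by the hypotenuse, for a right
    triangle with hypotenuse 1 and one acute angle of angle_deg degrees."""
    theta = math.radians(angle_deg)
    return 1.0 / (1.0 + math.sin(theta) + math.cos(theta))

friendly = {
    "3,4,5": (3, 4, 5),
    "45,45,90": (1, 1, math.sqrt(2)),
    "30,60,90": (1, math.sqrt(3), 2),
}
for name, sides in friendly.items():
    print(name, is_right_triangle(*sides))

# Compare the hypotenuse's share of the perimeter across a few acute angles.
for angle in (5, 30, 45):
    print(angle, round(hypotenuse_share(angle), 3))
```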

Once these concepts are automatic, you start to see them everywhere. If you want a better explanation of ‘friendly triangles’, try here:

http://www.purplemath.com/modules/trig.htm

But back to our original question, which was all about how you deal with having a very intuitive sense of something, which underpins your world view in a subtle but fundamental way that is difficult to describe. I don’t know. All I can do is to try to notice when it happens, and try to learn how to best describe it, which is really all you can do to try to communicate something unconscious to you and which may be outside the other person’s experience. I think a later post will talk about some of my other interactions with math of this type, and how I learned to describe while showing and sharing.

Waving Shipfish

The waves existed, as they always had. Well, as they assumed they did. There was not much memory in waves. Every so often, they would etch some comments onto shore rocks, or read comments from before. These comments were all-too-transitory for the waves, as they would inevitably erode them away all too soon. There were also the old stories kept alive by the deep waves, those of the time before waves, when the waves were rocks and rocks were waves. The old stories also told of times when sky water was different liquids, but those times were long gone.

But something different was happening now. Normally, the waves would be fed by sky water, nurtured by winds, but there were organics coming from above? Organics had not come from above since the sky water was different, and never in sizes larger than droplets. The waves were not concerned, as waves never are. But the waves felt the pain of the shore beasts diving under the waves for protection. At the same time, the underwater beasts seemed almost giddy, swarming to the surface and feeding voraciously every time the strange organics fell. The fliers would wink in and out, sometimes feeding, sometimes with fire, sometimes evading the sky organics.

Time passed. The waves existed. The organics stopped falling from above. They started again. They stopped. They started again. The waves were no longer visited by large shore beasts. The underwater beasts multiplied and proliferated. The fliers kept flying. The cycle continued. The waves existed. Time passed.

Something changed again. Large beasts from the sky! Some of metal! The waves had new friends! Large water beasts who talked to each other and played with the waves. The land beasts also played with the waves and traveled among the waves in mobile artificial land. As much as waves could feel joy, they felt joy.

The cycle progressed. The sky organics returned. The waves saw less of the beasts. There was less time for play. There was much fire above the waves, much pain from the land beasts. There were different chemicals at play. Runoff from the land beasts now included residue of strong dissolver. The Southern waves stopped seeing the land beasts. They heard word from the Northern waves that land beasts had appeared there and seemed to hide under rock, some artificial, some carved by waves. The waves were happy that their eons old carving work had served some purpose. The waves existed.

The waves existed. Time passed. The large water beasts played with the waves. The larger water beasts went deep under the waves and sang to them. The waves existed. The waves were happy. The waves existed.

The Power of Godzilla

The waves existed, as they have since the Earth had oceans and spun enough to displace them. It has been said that the quest of the waves is to travel all the way around the world, and that erosion of rocks is their slow and patient way to achieve this. Some say that they are opposed by the forces of fire and earth, who combine to make volcanoes, or to move plates, to create mountains and more land. But this is a story of a smaller disturbance…

The surface waves felt a new object coming up from below. The object reached the air, and the waves lapped around it, trying in their patient way to erode it, to continue on their traveling quest. The waves noticed that the object was green, not in the green way of tropical waves, but the green of an algal bloom out of balance. It had spikes, a crest, but the waves could no longer crest it, as it was rising further out of the water. A round head emerged, eyes open even underwater, showing that this creature was at home in salt water. This creature was larger, much larger than the largest of the underwater singers that the waves loved to listen to as they swam the oceans. The waves especially loved the large underwater singers because they would surface to take air, and sometimes even play with the waves, but that is another story, for now the green creature was emerging from the water.

The head was emerging from the waves. As the eyes passed, the waves saw that they were in pain. The waves did not like seeing creatures in pain. But they did not understand. So they watched, and waited, in their endlessly patient way.

A neck, arms, a torso emerged, then finally legs and a tail. The creature, now walking, still in pain, was walking towards the shore. The waves could see the small hairless creatures fleeing from the green creature. Sirens from the land. Screams of pain from the beast.

Doctor Kayama’s team was ready. They had analyzed all the recordings from the creature, and decoded a language they hoped to use to communicate with the beast. They rushed to the power station nearest the beach, as that was where the beast always attacked. They had increased the defenses and the walls, but it was never enough.

[The pain, the pain, the pain!] the beast cried. [Make it stop!]

It was now or never. Setting the speakers to maximum power, the team roared their own broadcast.

[What is the pain? Why do you always attack us?]

[The pain! The Noise! Why do you make that noise?!?]

[What is the noise?]

But it was too late. The beast had reached the power plant and was destroying the generators, as it had so many times before.

Reduced to battery power, the team only had enough power for a few more words:

[Why do you do this?]

The beast seemed to be calming down, or was it? It turned towards the broadcast, but instead of stomping, it stopped and roared:

[It is your electrons, you make them vibrate at 180 Maakktars. It causes so much pain! Why do you not vibrate them at 216 Maakktars like the other side of the island?]

The team spoke into their translator “What do you mean Maakktars?”, but it was too late. They were out of power.

The beast, seemingly no longer in pain, plodded back to the waves, who were much happier to see it now, as it always wanted to play on the way below the surface.

The beast played with the waves for a time, then started to sink beneath, to return from wherever it came. The legs and tail were the first to sink below the surface, then the torso, the arms, and the head, its eyes now serene.

Last to sink below the waves was the crest. As it sank below the waves, it created a small whirlpool, which, in time, became waves who re-embarked on their endless quest to travel around the world.

Hat tip: https://www.reddit.com/r/WritingPrompts/comments/40nb1o/wpafter_destroying_tokyo_yet_again_godzilla/

Solution Rotation

So, sometimes when someone asks me a question, I feel like I’m rotating through a number of possible solutions/solution types, like rotating through different options in a leather punch. https://www.google.com/search?tbm=isch&q=leather+punch

I first noticed this in a conversation with Garland Marshall, one of my favourite profs. at WashU: https://biochem.wustl.edu/faculty/faculty/garland-marshall. He’d asked me a question about how one would determine the structure of a binding site of a molecule too difficult to crystallize, too large to NMR, and impossible to get a structure with a bound ligand.

How do you come up with the structure of the binding site? I remember rotating through a number of different options, mostly focused on polling the ligand in various ways.

– Does this happen to other people?
– Is there a neuronal definition/description of this?
– What does this mean?
– Other types of analogies?

On the ‘neuronal pathway’ front, it could be something like activating different pathways in sequence, doing it manually, rather than letting your brain activate all of them at the same time, then aggressively pruning them (to save energy). So, you would actively control your thoughts, to try out each channel independently, and submit them to more rigorous logic, to make sure you hadn’t left anything out. Somewhat like taking the ‘mental shackles off’, asking an audience ‘what ideas would your most creative and silly friend have about what to do with a brick?’, rather than ‘what ideas would you have about what to do with a brick*?’

*This seems to have been adapted from the Torrance Tests of Creative Thinking https://en.wikipedia.org/wiki/Torrance_Tests_of_Creative_Thinking

Problem Solving Examples (With some Machine Learning)

So, in a previous post, (http://nayrb.org/~blog/2015/12/25/automation-and-machine-learning/), we talked about some methods to help you decide whether you actually needed Machine Learning or not to solve your problem. This post talks about various problem-solving approaches and which types of problems they can make tractable.

I started my career fascinated by protein folding and protein design. By the time I got there, they had narrowed the question down to one of search: ‘Given this physics-based scoring function, how do I find the optimal configuration of this molecule?’ There were a number of different techniques they were using (gradient descent, Monte Carlo, simulated annealing), but they all boiled down to finding the optimal solution to an NP-Complete problem.
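To give a flavour of what ‘search over a scoring function’ looks like, here is a minimal sketch of simulated annealing. Everything in it is an assumption for illustration: the one-dimensional `score` function is a toy with several local minima, nothing like a real physics-based scoring function, and the cooling schedule and step size are arbitrary choices.

```python
import math
import random

def score(x):
    """Toy 'energy' with several local minima; lower is better.
    Purely illustrative, not a physics-based scoring function."""
    return x * x + 10.0 * math.sin(3.0 * x)

def simulated_annealing(start, steps=20000, t_start=10.0, t_end=1e-3, seed=0):
    rng = random.Random(seed)
    current, current_score = start, score(start)
    best, best_score = current, current_score
    for i in range(steps):
        # Geometric cooling schedule from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (i / (steps - 1))
        candidate = current + rng.gauss(0.0, 0.5)
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Metropolis criterion: always accept improvements,
        # and sometimes accept worse moves while the temperature is high.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score

best_x, best_e = simulated_annealing(start=8.0)
print(best_x, best_e)
```

The early, hot phase lets the search hop out of local minima; the late, cold phase behaves more like plain descent. Nothing here guarantees the global optimum, which is rather the point of the paragraph above.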

As we know that biological systems can perform protein folding quickly, there must be some algorithm which can do this (even if it means simulating each individual electron). This can then be restated as a simulation/decision question, from the perspective of a cell/physics. Many other search problems have similar human-like or physics-like easier solutions (ways of finding the NP-Complete verifier). For example, as a traveling salesperson, you would look at the map, and be able to narrow down the routes to some smaller number, or be able to quickly narrow down the options to a small number of sets of routes.

In many ways, this is the ‘holy grail’ of Machine Learning, the ability for a machine to step away from what we tell it, and to be able to solve the problem in a more direct way. Heuristics are an attempt to solve this problem, but they’re always somewhat rules-based.
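As one example of a rules-based heuristic for the traveling salesperson case, here is a sketch of the nearest-neighbour rule: always go to the closest city you have not visited yet. This is my choice of illustration, not something from the protein folding work, and the city coordinates are made up. It is fast and usually decent, but it does not promise the shortest tour.

```python
import math

def tour_length(points, order):
    """Total length of the closed tour visiting points in the given order."""
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )

def nearest_neighbour_tour(points, start=0):
    """Greedy heuristic: always travel to the closest unvisited city."""
    unvisited = set(range(len(points)))
    unvisited.remove(start)
    order = [start]
    while unvisited:
        here = points[order[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(here, points[j]))
        order.append(nxt)
        unvisited.remove(nxt)
    return order

# Hypothetical city coordinates, for illustration only.
cities = [(0, 0), (5, 0), (1, 1), (5, 4), (0, 4), (2, 3)]
order = nearest_neighbour_tour(cities)
print(order, round(tour_length(cities, order), 2))
```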

Next is clustering, best used for differentiating between different groups of things so that you can make a decision. My favourite is ‘Flow Cytometry’ https://en.wikipedia.org/wiki/Flow_cytometry, where you’re trying to differentiate different groups of cells, basically through clustering on a 2-D graph of the brightness of various fluorescent cell markers.
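A minimal sketch of that kind of clustering, using k-means on made-up 2-D points. The two Gaussian blobs stand in for two cell populations and are not real cytometry data; the farthest-first starting centres are my own simplification to keep the example deterministic. Real flow cytometry analysis often relies on manual gating or more specialised methods.

```python
import math
import random

def farthest_first(points, k):
    """Pick k starting centres: the first point, then repeatedly the point
    farthest from all centres chosen so far."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centres))
        )
    return centres

def k_means(points, k, iterations=50):
    centres = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centre.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centres[c]))
            for p in points
        ]
        # Update step: move each centre to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centres.append(
                    tuple(sum(v) / len(members) for v in zip(*members))
                )
            else:
                new_centres.append(centres[c])  # keep an empty cluster's centre
        if new_centres == centres:
            break
        centres = new_centres
    return centres, labels

# Made-up data: two blobs of (marker A brightness, marker B brightness).
rng = random.Random(1)
dim = [(rng.gauss(1.0, 0.2), rng.gauss(1.0, 0.2)) for _ in range(50)]
bright = [(rng.gauss(4.0, 0.2), rng.gauss(4.0, 0.2)) for _ in range(50)]
centres, labels = k_means(dim + bright, k=2)
print(centres)
```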

Customer persona clustering is another example, such as you might do for segmentation, where standard groups like age or location would not be good enough.

Machine Learning problems such as the Netflix challenge http://www.netflixprize.com/, where you want a large degree of accuracy in your answer, require the use of a number of techniques. (The problem was to take a list of customer movie ratings and predict how those customers would rate other movies.)

First, you need to clean and normalize the data. The authors were also able to separate the general opinion of each movie from the specific opinion each person had about each movie. (Each of these was about as important to the overall result.) Each of these normalizations or bias removals would likely have been done with some form of machine learning, suggesting that any comprehensive usage would require multiple pipelines or channels, probably directed by some master channels* learning from which of them were the most effective.
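To make the ‘general opinion versus specific opinion’ split concrete, here is a rough sketch of a simple baseline: a global average, plus an offset per movie, plus an offset per user. This is a plain averaging scheme I am using for illustration, not the method the prize entrants actually used, and the names and ratings below are invented.

```python
from collections import defaultdict

def fit_baseline(ratings):
    """ratings: list of (user, movie, rating).
    Returns (global_mean, movie_bias, user_bias)."""
    global_mean = sum(r for _, _, r in ratings) / len(ratings)

    # Movie bias: how far each movie's average sits from the global mean.
    by_movie = defaultdict(list)
    for _, movie, r in ratings:
        by_movie[movie].append(r - global_mean)
    movie_bias = {m: sum(v) / len(v) for m, v in by_movie.items()}

    # User bias: what is left, on average, once the movie effect is removed.
    by_user = defaultdict(list)
    for user, movie, r in ratings:
        by_user[user].append(r - global_mean - movie_bias[movie])
    user_bias = {u: sum(v) / len(v) for u, v in by_user.items()}
    return global_mean, movie_bias, user_bias

def predict(user, movie, model):
    global_mean, movie_bias, user_bias = model
    # Unknown users or movies fall back to a bias of zero.
    return global_mean + movie_bias.get(movie, 0.0) + user_bias.get(user, 0.0)

# Invented ratings, for illustration only.
ratings = [
    ("ann", "Alien", 5), ("ann", "Brazil", 4),
    ("bob", "Alien", 4), ("bob", "Brazil", 2), ("bob", "Cars", 3),
    ("cat", "Cars", 4),
]
model = fit_baseline(ratings)
print(round(predict("ann", "Cars", model), 2))
```

Whatever is left over after these offsets are subtracted is the part that is specific to one person and one movie, which is where the more interesting modelling would have to happen.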

I wonder how much of what we do as humans involves breaking down the problem, to divide and conquer. When we’re asked for a movie recommendation, do we think of good movies first, then what that person would think of? Personally, I feel I get my best results when I try to put myself in that person’s shoes, suggesting there may be a long way still to go.

Perhaps looking at groups of movies, or some sort of tagging, to get at whatever ‘genes’ may be underneath, as you may like certain things about movies which are only imperfectly captured by how people like them similarly. (Or perhaps, the data is big enough to capture all of this. It’s fun to speculate. 😀 )

*This suggests a hierarchy, which is only one way of seeing the structure. Other views are possible, but outside the scope.

BOF VI: The Chemist in me:

UPDATE: to address tormuse’s comments below.

This discussion came out of a comment I made about the usage of ‘cis’ to refer to people whose gender identity is more matching with their gender assigned at birth*.

I said:

“The chemist in me is glad that this word is being used in this way. :)”

And DM replied, asking about the ‘difference between Cis–trans isomerism and chirality’.

Not knowing if this was a serious question, and not wanting to derail a trans- issue conversation with chemical pedantry, I’m putting my answer in this blog post:

Don’t know if this is a serious question, but I’ll bite. Cis/Trans isomerism (generally) has to do with two carbons connected by a double bond, with different things at R1/R2/R3/R4:

R1     R3
  \   /
   C=C
  /   \
R2     R4

R1     R4
  \   /
   C=C
  /   \
R2     R3

These two molecules have the same chemical composition, but because the C=C bond is rigid and does not rotate, they can have different chemical properties.

Chirality is a little bit more difficult to explain using 2-D ASCII Art, but basically:

 R1
   \
r4--C--R3
   /
 R2

(With r4 at the back, behind the C, with R1/R2/R3 sticking out slightly from the page. #limitationsof2d)

Contrast with:

 R1
   \
r4--C--R2
   /
 R3

Which is a non-superimposable mirror image in 3-D. Even if R1/R2/R3 rotate around the r4 axis, the two molecules will never have exactly the same chemical characteristics (cf. Thalidomide).
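Since 2-D ASCII art only goes so far, here is a small numerical sketch of the same idea. The coordinates are an idealised arrangement I made up (r4 pointing away from the viewer, R1/R2/R3 fanned out towards them), and `handedness` is a hypothetical helper, not a standard chemistry routine. It uses the sign of a scalar triple product: swapping R2 and R3 flips the sign, while spinning the molecule around the r4 axis does not.

```python
import math

def signed_volume(a, b, c):
    """Scalar triple product a . (b x c); its sign encodes handedness."""
    cross = (
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    )
    return a[0] * cross[0] + a[1] * cross[1] + a[2] * cross[2]

def handedness(r1, r2, r3, r4):
    """+1 or -1 depending on the order R1 -> R2 -> R3 as seen from opposite r4."""
    rel = [tuple(p[i] - r4[i] for i in range(3)) for p in (r1, r2, r3)]
    return 1 if signed_volume(*rel) > 0 else -1

def rotate_z(p, degrees):
    """Rotate a point about the z axis (the r4 axis in this toy geometry)."""
    t = math.radians(degrees)
    return (
        p[0] * math.cos(t) - p[1] * math.sin(t),
        p[0] * math.sin(t) + p[1] * math.cos(t),
        p[2],
    )

# Idealised positions: r4 at the back, R1..R3 spaced 120 degrees apart in front.
back = (0.0, 0.0, -1.0)
front = [rotate_z((1.0, 0.0, 1.0 / 3.0), angle) for angle in (0, 120, 240)]

molecule = handedness(front[0], front[1], front[2], back)
mirror = handedness(front[0], front[2], front[1], back)  # R2 and R3 swapped
spun = handedness(*[rotate_z(p, 77) for p in front], back)

print(molecule, mirror, spun)
```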

*I’m sure I’m using some of these words incorrectly or imprecisely. Please comment to correct me!