So, I was talking with A earlier this week about meetings, and she mentioned the issues that many people have with conference calls.
But what are those issues? I can only talk about issues that I’ve had with conference calls.
For those who are not familiar, we’ll start with audio conference calls.
A humorous video by Tripp & Tyler may help illustrate.
To me, these problems can be broken down into the following categories:
– Absence of body language
– Outside distractions
– Other audio artifacts of VOIP
– Technical issues with audio conferencing software
We’ll start with the technical factors.
Lag feels like it’s only gotten more prevalent with greater use of mobile phones and VOIP. Of the two components of lag (encoding/decoding time and routing/travel time), you can probably improve routing/travel time the most by spending more money on better dedicated VOIP connections. You may also get some mileage from having your conferences during off-peak hours and being on a wired (rather than wireless) connection. The anti-jitter algorithms described in this ‘how VOIP works’ article inherently have a tradeoff* between jitter/dropout and lag. If you make things easier for them, they should be able to improve both for you.
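To make that tradeoff a bit more concrete, here is a minimal Python sketch of a fixed-size playout buffer. It is a toy model, not how any particular VOIP product works: `playout_stats`, the delay values and the buffer sizes are all made up for illustration. Each packet is held until `buffer_ms` after it was sent, and anything that arrives later than that is treated as lost.

```python
def playout_stats(delays_ms, buffer_ms):
    """Toy fixed-size jitter buffer (hypothetical helper, for illustration).

    delays_ms: network delay of each packet, in milliseconds.
    buffer_ms: how long the receiver waits before playing a packet.
    Returns (added_lag_ms, fraction_of_packets_dropped).
    """
    # A packet that takes longer than the buffer allows misses its slot.
    late = sum(1 for delay in delays_ms if delay > buffer_ms)
    return buffer_ms, late / len(delays_ms)


# Made-up delays for eight packets on a jittery connection.
delays = [20, 25, 30, 90, 22, 60, 28, 35]

for buffer_ms in (30, 60, 100):
    lag, dropped = playout_stats(delays, buffer_ms)
    print(f"buffer {buffer_ms} ms: lag {lag} ms, dropped {dropped:.0%}")
```

A bigger buffer drops fewer packets but adds more lag. A steadier connection, with less spread in the delays, lets a smaller buffer do the same job.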
Other audio artifacts of VOIP:
These other audio artifacts are also products of the packet data nature of VOIP. ‘Toilet bowl audio’ is caused by VOIP losing packets and the sound being recreated artificially by the algorithms. (Before they figured this out, you would hear pops or crackles or even more annoying sounds, like in early mp3 encodings.) Sound cutting out is the result of too many consecutive packets being lost.
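As a rough illustration of 'recreating the sound artificially', here is a sketch of one very simple concealment strategy: replay the last good frame, a bit quieter each time, until real audio arrives again. This is a simplified stand-in for the general idea, not the algorithm any specific codec uses (real ones are considerably smarter), and `conceal_losses` and its parameters are hypothetical names.

```python
def conceal_losses(frames, frame_len, fade=0.5):
    """Fill in lost audio frames by repeating the last good one, fading out.

    frames: list of frames; each frame is a list of samples, or None if
            the packet carrying it was lost.
    frame_len: number of samples per frame (used if the very first frame is lost).
    fade: gain multiplier applied for each consecutive lost frame.
    """
    output = []
    last_good = [0.0] * frame_len  # silence until we have heard something
    gain = 1.0
    for frame in frames:
        if frame is None:
            # Lost packet: replay the last good frame, quieter than before.
            gain *= fade
            output.append([sample * gain for sample in last_good])
        else:
            # Real audio arrived: play it as-is and reset the fade.
            last_good = frame
            gain = 1.0
            output.append(list(frame))
    return output


# Two-sample frames, with two packets lost in a row in the middle.
received = [[1.0, -1.0], None, None, [0.5, 0.5]]
print(conceal_losses(received, frame_len=2))
```

With one lost packet the gap is papered over. With many in a row the gain keeps shrinking, so the sound effectively cuts out.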
Feedback is an interesting one. I’ll use the iPhone as an example. When you have the speaker on a device very close to the microphone, you’re liable to get feedback. The device gets around this by keeping track of the sound it sends out through the speaker, and ‘subtracting’ an estimate of that sound from what the microphone picks up. The echoes you may sometimes hear are what happens when this fails. These algorithms were required to make satellite communications viable.
(A better history of echo cancellation is here, for those who are interested: ECHO_history_of_echo_cancellation )
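For the curious, the 'subtracting' idea can be sketched with a small adaptive filter. The code below uses a normalized least-mean-squares (NLMS) filter, which is a common textbook approach to echo cancellation; it is an assumption that it resembles what any given phone does. The function name `cancel_echo`, the tap count, the step size and the simulated echo path (a one-sample delay at 60% volume) are all invented for this example.

```python
import random


def cancel_echo(far_end, mic, taps=4, step=0.5):
    """Remove an estimate of the far-end echo from the microphone signal (NLMS sketch)."""
    weights = [0.0] * taps   # current guess at how the room echoes the speaker
    history = [0.0] * taps   # most recent far-end samples, newest first
    cleaned = []
    for played, heard in zip(far_end, mic):
        history = [played] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = heard - echo_estimate  # what is left after subtracting the echo
        power = sum(h * h for h in history) + 1e-9  # avoid dividing by zero
        # Nudge the weights so that the next estimate is closer.
        weights = [w + step * error * h / power for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned


# Simulated call: nobody is talking at the near end, so the microphone
# picks up only a delayed, quieter copy of what the speaker played.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [0.0] + [0.6 * sample for sample in far_end[:-1]]

cleaned = cancel_echo(far_end, mic)
print("loudest echo at the mic:", max(abs(s) for s in mic[-100:]))
print("loudest residual after cancelling:", max(abs(s) for s in cleaned[-100:]))
```

The first few samples still carry echo, because the filter has not yet learned the echo path. That warm-up period, or a room that changes faster than the filter can follow, is one way the echoes sneak through.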
Other audio artifacts of VOIP have similar origins and solutions.
Technical issues with audio conferencing software:
This one still puzzles me. Like microwaves, they all seem to be different, and none of them is intuitive. I can’t tell if this is because the industry has not converged on a solution, or because the problem really is that unsolvable. My current favourite is Google Hangouts, but that could be because I generally use it for one-on-one conversations. Perhaps the problem comes from the always problematic nature of security, when controlling who is able to phone into a conversation. But even when there is no conference call security, there are still always issues with people calling into the conference the wrong way. I feel like the solution is to have an intuitive interface, where you can see all the calls coming in to your phone and then drag them together to make a conference.
We could even make a game of this. We could call it ‘21st century switchboard operator**’.
Now, on to human factors.
Absence of body language:
This is a tough one. Interestingly, humans figured out a method for showing body language in text at most 11 years after the first email*** was sent, while 140 years after the invention of long-range audio communication****, we still do not have an effective method of conveying body language over audio transmissions. I would say that video conferencing will supplant all audio within our lifetime, but there is still space communication, satellite communication, dark rooms, non-working cameras, etc… Perhaps some sort of interstitial click language would work.
In the meantime, the best solution is to have people meet each other before they engage in a conference together: in person if possible, otherwise over video, or at least in a one-on-one audio call.
The other elephant in the room is people who, even in person, tend to miss the body language cues that tell them it is time to finish talking. In person, this can be difficult, even with a strong moderator. In an audio conference, this can be well-nigh impossible. A moderator with the ability to selectively mute participants might work. The social hierarchies in many organizations may not permit this, but improving the flexibility of those hierarchies and teaching people to *listen* is one of the key components of Agile.
Outside distractions:
This one feels like a tossup between having a strong moderator and having an engaged workforce. Sometimes life does indeed intrude into work, but if this is occurring on a regular basis, perhaps it’s an indication that the meeting is at the wrong time, is too long or too low a priority, or that the participants are not as engaged as they could be, for whatever reason. Addressing those issues is probably the best next step here.
So, that was a lot of words. Apparently I have a lot of thoughts about this. If you want more, comment below!
Some other useful links:
An explanation of packet loss and discards.
*Because you’re sending voice data in packets, these packets have to be reassembled at the other end. Because the packets are going over the internet, they can be delayed. A delayed packet either has to be left out or waited for. This causes jitter and lag, respectively. If you have a better connection, the algorithms can make better decisions for you.
**This just makes me appreciate the people who did this job even more, and I always thought it was difficult.
***The article also goes in depth about the specific strengths of email, and how it may be a more natural method of communication for humans than some other types…
****I did not know before reading this article that Alexander Graham Bell was “Professor of Vocal Physiology at Boston University [and] engaged in training teachers in the art of instructing deaf mutes how to speak”