Iotum - Voice 2
Alec Saunders
It’s already begun. The arrival (and mass adoption) of technologies like Skype, Peerio, and
PhoneGnome are one indicator. Another is the accelerating loss of landline business amongst
incumbent carriers. In the first quarter of this year, North American landline attrition doubled. As
I write this today, landline cancellations have reached 10,000 per day, as customers opt for
cable, mobile, or VoIP solutions over services from traditional providers.
What will that world look like? Who will be the winners, the losers, the moneymakers? What
will the consumer experience be? Follow along with me, and let’s have a closer look. This essay
is part fiction, and part reality. It’s a whole lot of what I would like to see in the communications
platform of the future, which I have dubbed Voice 2.0.
In a typically Scandinavian understatement, Skype founder Niklas Zennstrom’s rationale for why
Skype is so successful was this: "People like to talk". People do like to talk. As communications
services have become ever cheaper, the explosion of usage has been remarkable. For
instance, according to the FCC, between 1994 and 2001, average minutes of wireless usage
per month climbed from 140 minutes per month per user, to 427 minutes per month per user.
Over the same period of time, annual US landline usage grew from 2.8 trillion to 4.8 trillion
minutes. Prices fell, usage skyrocketed.
The merger of talk with the web is the foundation of Voice 2.0. When Skype launched, and the
price of minutes dropped to zero, social barriers to calling strangers disappeared, driving voice
usage higher again. The merger of talk and the web is leading to web based conferencing,
push to talk, application sharing, voice enabled e-commerce, and a multitude of other
applications, all of which are driving voice usage higher. In the process this merger is redefining
the staples of business — customer service, sales, and marketing — and impacting all of our
lives as we move from the standard work day to 24/7 availability.
Talk is the baseline, but that baseline will be combined with text / IM messaging, and video.
Today’s networks can support the technologies. The evolution to full blown, multimedia, real
time communications is just a matter of time. Some products, like the Nokia N90 cellular
telephone, are already providing this capability. Nokia’s E Series telephones will also have built
in SIP clients, facilitating a seamlessly mobile VoIP world.
As speculative fiction writer William Gibson said, "The future is already here, it’s just unevenly
distributed." It begins with talk.
Voice will be free, as the Skypers contend, and the Stupid Network model implies. Short term,
all you can eat models, like Vonage, will exist, but long term it’s clear that the metered model is
Currently, the only widespread metered model in VoIP is metered access from the IP network to
the PSTN. But how long before the majority of customers are on the IP network, and the model
reverses? When will we see PSTN customers pay a premium to contact their friends on VoIP
networks?
In the Voice 2.0 world, there will be three billable entities: connectivity, directory, and
applications. Connectivity and directory will be low margin, commodity businesses. Customers
will pay for access to the network, and perhaps to be listed in a directory. Applications will be
the value creators. The meter is off.
If talk is the baseline, then applications will be where value is created. Applications in the Voice
2.0 world will come in three flavours:
• Voice applications - these are the traditional voice applications we already know and
love. IVR, unified messaging, conferencing, and perhaps others. Sophisticated new
tools, such as VoiceXML, will provide ways to create richer and more powerful voice
applications, and drive further value.
• Voice enabled IT applications - these are the intersection of today’s business process
automation tools with voice. Sales force automation, CRM, accounting, email, payroll,
etc. Every application we use today will become communications enabled with voice.
Next generation softphones will evolve from simple replacement systems for road
warriors into full blown platform components for knowledge worker applications.
• The voice web - the mash-up of voice and the internet will result in a whole new class of
applications, not seen yet. VoiceXML is the first step in this direction. Tools like
PHPVoice take this a step further, allowing for sophisticated voice applications to be built
using the same kinds of scripting tools that the web is built from. However, the real pot
of gold is the combination of web services and voice. Examples include: spoken word
real estate descriptions from the MLS coupled with mapping, voice enabled matchmaker
services, customer service coupled with inventory / ordering / availability. The mix of
text, web, voice, and programmatic access to data is a heady brew.
The construction and sale of these applications will be a market bigger than the web itself.
Applications are where the value is created in Voice 2.0.
Much of the talk in VoIP is about new kinds of media — video, wideband voice, IM. Media may
be the star of the new network, but the workhorses of Voice 2.0 applications are signalling and
control components. Presence, to determine availability; directory, to determine addressing and
routing; and XML web services for call control, and integration with computing assets. These
are the true value creation components in the architecture.
Presence will drive a fundamental change in the way that communications networks are used
today. Today, callers have no way of knowing whether the party being called is available, or
Directories have existed since the advent of voice networks. However, in the Voice 2.0 world,
individuals own their own directory listings. What you wish to list in your directory listing,
including the fundamentals of name, address, and contact point(s), is your business. It’s your
identity, and you get to manage it — not the carrier. Directories can be extended to include the
idea of personas (work, home, leisure), interests, and a myriad of other kinds of personal
information. Directories also become repositories for subscriber preferences, credentials, social
networking details and potentially even financial information for voice enabled transactions. In
the Voice 2.0 world, the directory is an opt-in enabler for applications, commerce, and identity.
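To make the idea of a subscriber-owned, opt-in directory a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the class names (Persona, DirectoryListing), the field names, the application id "crm-app", and the example.com addresses are hypothetical and do not describe any real directory service.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    """One face a subscriber presents (for example work, home, leisure)."""
    name: str
    contact_points: list = field(default_factory=list)
    # Ids of applications the subscriber has opted in for this persona.
    allowed_apps: set = field(default_factory=set)


@dataclass
class DirectoryListing:
    """A listing owned and edited by the subscriber, not the carrier."""
    owner: str
    personas: dict = field(default_factory=dict)

    def add_persona(self, persona):
        self.personas[persona.name] = persona

    def lookup(self, app_id, persona_name):
        """Return contact points only if the subscriber opted this app in."""
        persona = self.personas.get(persona_name)
        if persona is None or app_id not in persona.allowed_apps:
            return []
        return list(persona.contact_points)


# Hypothetical subscriber with two personas; only "work" is opened to the CRM app.
listing = DirectoryListing(owner="alice")
listing.add_persona(Persona("work", ["sip:alice@example.com"], {"crm-app"}))
listing.add_persona(Persona("home", ["sip:alice-home@example.net"]))

work_for_crm = listing.lookup("crm-app", "work")
home_for_crm = listing.lookup("crm-app", "home")
```

The design choice worth noticing is that the default answer is "nothing": an application sees a contact point only when the subscriber has explicitly listed that application against that persona.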
And last, but probably not least, are XML based web services APIs for accessing presence,
call control, and directory. Scriptable building blocks have been part of IT for over a decade.
One of the primary successes of Web 2.0 has been the extension of scripting to the internet,
with XML based APIs used to create mash-ups between applications. Scriptable
communications building blocks are the next step. In the Voice 2.0 world, any application, within
the bounds of permissions set by the subscriber, can access presence; initiate, accept, and
redirect calls; and query directories. Experiments today, like Free World Dial-Up’s exposure of
CPL and CCXML, and Skype’s partnerships with VoiceXML providers, are pointing the way.
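As a rough illustration of what "scriptable building blocks within the bounds of permissions set by the subscriber" might feel like to a developer, here is a toy, in-memory sketch in Python. It is not CPL, CCXML, or any vendor's API. The VoicePlatform class, its method names, the permission strings, and the voicemail redirect rule are all assumptions made up for this example.

```python
class PermissionDenied(Exception):
    """Raised when an application lacks a subscriber's permission."""


class VoicePlatform:
    """Toy in-memory stand-in for presence and call-control services."""

    def __init__(self):
        self._presence = {}  # subscriber -> status string
        self._grants = {}    # subscriber -> set of (app_id, action)
        self.call_log = []   # (caller, callee, outcome) tuples

    def set_presence(self, subscriber, status):
        self._presence[subscriber] = status

    def grant(self, subscriber, app_id, action):
        """The subscriber opts an application in for one action."""
        self._grants.setdefault(subscriber, set()).add((app_id, action))

    def _check(self, subscriber, app_id, action):
        if (app_id, action) not in self._grants.get(subscriber, set()):
            raise PermissionDenied(f"{app_id} may not {action} for {subscriber}")

    def get_presence(self, app_id, subscriber):
        self._check(subscriber, app_id, "read_presence")
        return self._presence.get(subscriber, "unknown")

    def initiate_call(self, app_id, caller, callee):
        """Place a call, redirecting to voicemail if the callee is not available."""
        self._check(callee, app_id, "call")
        if self._presence.get(callee) == "available":
            outcome = "connected"
        else:
            outcome = "redirected:voicemail"
        self.call_log.append((caller, callee, outcome))
        return outcome


platform = VoicePlatform()
platform.set_presence("bob", "busy")
platform.grant("bob", "crm-app", "read_presence")
platform.grant("bob", "crm-app", "call")

status = platform.get_presence("crm-app", "bob")
outcome = platform.initiate_call("crm-app", "alice", "bob")
```

An application that Bob has not opted in (say, a hypothetical "spam-app") would get a PermissionDenied error from either call, which is the point: presence and call control are open to scripts, but only on the subscriber's terms.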
The business implications of creating an open, programmable web services architecture for
voice are profound. When service provider architectures are open, the envelope (the monthly
envelope that includes the integrated bill for all the services you buy from your service provider)
disappears. Where does a service provider’s network begin and end in this world? How do
you monetize the component building blocks? The applications? Should you limit access to the
building blocks or make them entirely open?
Will subscribers pay for the platform building blocks? No. Platform components, in and of
themselves, are not interesting to consumers. Applications are. Revenue from the applications
must be shared with building block owners in order to facilitate growth of the platform. The
business model is built on settling transactions at the interfaces between applications and
building block owners, rather than the interfaces between networks.
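A small worked example may help show what settling at the application interface could mean in practice. The sketch below, in Python, splits one application sale between the building block owners and the application vendor. The settle function, the owner names, and the 10% and 5% shares are purely illustrative assumptions, not figures from any real program.

```python
from decimal import Decimal


def settle(sale_amount, block_shares):
    """Split one application sale between building-block owners and the app vendor.

    block_shares maps each building-block owner to the fraction of the sale
    it is owed. Whatever remains goes to the application vendor.
    """
    total_share = sum(block_shares.values(), Decimal("0"))
    if total_share > 1:
        raise ValueError("building-block shares exceed the whole sale")
    payouts = {
        owner: (sale_amount * share).quantize(Decimal("0.01"))
        for owner, share in block_shares.items()
    }
    payouts["application"] = sale_amount - sum(payouts.values(), Decimal("0"))
    return payouts


# Hypothetical shares: 10% to the access provider, 5% to the directory provider.
payouts = settle(
    Decimal("10.00"),
    {"access": Decimal("0.10"), "directory": Decimal("0.05")},
)
```

Under those assumed shares, a 10.00 sale pays 1.00 to the access provider, 0.50 to the directory provider, and leaves 8.50 with the application vendor. The subscriber sees one price for the application and never pays for the platform components directly.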
Fundamentally, this turns the service provider value network on its head. In today’s world, the
network operator aggregates services from a number of vendors, and then delivers them to the
customer. Tomorrow, the customer will buy the services they want from whomever they want,
and the service provider will deliver a portion of that revenue to the owner of the platform
component. Voice will be monetized through the long tail of high value applications targeted at
specific communities of users.
In the Voice 2.0 world, applications are ascendant, and platform and network components are
buried in the applications.
Voice 2.0 is a user-centric view of the world. In Voice 2.0, "it’s all about me" — my applications,
my identity, my availability. Voice 2.0 is all about developers too — the companies that exploit
the platform assets of identity, presence, and call control. It’s not about the network anymore.
All of the technical underpinnings I’ve described so far exist today. One element still missing is
a common, standardized, presentation layer. Standards exist for this layer — VoiceXML, SIP,
SALT, etc. all bear on presentation. However, at this point there is no ubiquitous equivalent of
the HTML browser. The closest yet are Skype (completely proprietary) and Gizmo Project (well
below ubiquity, but very complete standards implementation). It’s most likely that one of the VoIM
players (Microsoft, Yahoo, Google, AOL) will drive this.
And what of the PSTN stalwarts, the incumbents? For some, their time is done. Others, such
as Bell Canada, have made consistent and intelligent investments in next generation
technologies. The challenge for those will be The Innovator’s Dilemma — how to transition into
this new world, while maintaining profitability, and retaining shareholder loyalty. It will take a
deft hand, indeed.
When I wrote the Voice 2.0 Manifesto about a year ago, it envisioned an application-centric
communications world; a marriage of telephony and the web, leading to new models for
communication, and new business models for service providers. Users would regain control,
and there would be a richness of services available that simply doesn’t and can’t exist under
today’s telecom oligarchs.
The customer experience predicted by the Voice 2.0 Manifesto is not of a single carrier, but
rather of three classes of entities – access, directory, and applications. As a customer, you’ll
pay to be part of the network, you may pay for an identity (and this is an idea whose time will
come, although it’s hard to see today), and you’ll pay for applications that help you
communicate in a diverse number of ways. This is a very different model from the traditional,
vertically integrated, communications network.
As a vendor of each of these services, you may have multiple relationships with other providers
of other services. You may provide wholesale services from some vendors, and retail your
services direct to the customer. Or, you may be the wholesaler, selling through another
vendor. Or, you may leave that choice to the consumer. Services may be offered bundled, or
unbundled. It really doesn’t matter.
Twelve months later, that vision of Voice 2.0 is as vibrant as ever, and some of the things which
the Manifesto predicted are already here.
Today, for instance, you can buy originations and terminations from any number of service
providers, ranging from the local guy I use (Unlimitel, here in Canada) to Nufone (the guy I use
to terminate my US calls), Level 3, XO, Worldcom, and many, many others. Some vendors only
sell wholesale; others will sell retail. You may obtain identity services from MSN, Yahoo, AOL,
Unsurprisingly, the biggest stumbling block to the Voice 2.0 vision is the incumbent service
provider. Not only do these folks move at glacial speeds, but most regard the Voice 2.0 model
as a threat, rather than an opportunity. Today’s service provider aggregates identity, access,
and applications and presents a single bill to the customer. Today’s billing mechanism is tied to
the metering of access to the network, which is the service provider’s physical asset. The
service provider is in control.
In a Voice 2.0 world, control is with the customer. I buy the services I want, from whom I want.
It’s a competitive market! In the Voice 2.0 world, the value stack, if you want, inverts. Where the
primary means of monetizing network assets in the incumbent’s model was a toll on voice, in
the Voice 2.0 model, the primary means is the application. The access network provider, and
the identity provider, are selling their services to the applications provider. You, as the
consumer, choose the applications you want, not just those provided by the company that
delivers you access. Voice 2.0 is about you and me, not the network. For an incumbent, that’s a
scary idea.
This world isn’t going to all come about at once. At VON this past fall, Jeff Pulver took us to
task on the blogger’s panel for believing that tolls would ever disappear. He may be
right. Myself, I believe tolls will go the way of the dinosaur, but it will take time, and arguably
today tolls aren’t an issue anymore. As many folks have pointed out this past week
while critiquing Jajah and Rebtel’s respective models, in a world where voice is already cheap
who cares if it’s a little cheaper? On many networks, voice is already cheap, but it’s not yet
cheap everywhere.
Perhaps the most important step forward on the road to Voice 2.0 this year was AOL’s very
gutsy developer play. By opening their network to third party applications, AOL has said to
developers “share your applications with us, and we’ll share our access and identity
assets.” Unlike the Skype or SipPhone developer programs, AOL has given developers access
to the network itself. Although developers have called for a “naked” Skype, none has been
forthcoming, which makes AOL’s move far more significant. SipPhone’s initiative appears to be
a clone of the Skype initiative, but addressing a smaller customer base. And, unlike Skype or
SipPhone, AOL has a shared revenue model, which ensures that the applications developer and
Think about that for a minute. As a developer of applications, you do not need to source
terminations and originations. AOL has done that for you. You do not need to build the network
infrastructure. AOL has also done that. All you need do is focus on your application.
We need more AOLs. There have been attempts, but they have fallen short, or been
terminated. Google’s model is great, but provides only a core identity and presence service.
TelTel’s ISIPTN initiative held lots of promise, but ultimately died for lack of
management support. We need many more service providers to open their networks, and offer
those services to developers as platforms.
Speaking from personal experience, it was this lack of platform that caused iotum to focus so
much attention on Asterisk, and subsequently AIM Phoneline. These platforms presented an
opportunity to reach the customer, without having to deal with ILEC business processes. We
run a large hosted service which can attach to any network the customer wishes, via an XML-
RPC or SOAP interface. We use the identity and presence information you have already by
interfacing with AIM and MSN today, and tomorrow any IM client you wish. We simply provide
the applications layer. Until recently, this pure Voice 2.0 approach has been way ahead of the
market.
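For readers who have not seen XML-RPC, here is a minimal sketch of what a call across that kind of interface looks like, using only Python's standard xmlrpc.client module. The method name relevance.routeCall and its two arguments are hypothetical, invented for this illustration. They are not iotum's actual interface.

```python
import xmlrpc.client

# Marshal a hypothetical call: ask an applications layer how to route
# a call from one address to another. The method name is made up.
request_xml = xmlrpc.client.dumps(
    ("sip:caller@example.com", "sip:alice@example.com"),
    methodname="relevance.routeCall",
)

# The receiving side unmarshals the same XML back into a method name
# and a tuple of parameters.
params, method = xmlrpc.client.loads(request_xml)
```

Because the request is just XML, any network that can send and receive it can attach to the applications layer, which is what makes this style of interface network-agnostic.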
So, it’s easy to sympathize with both Ken Camp’s frustration that GrandCentral requires him to
have yet another telephone number, and Craig Walker’s decision to provision new numbers for
GrandCentral customers. Ken’s right — none of us need another phone number. But the
infrastructure that a Voice 2.0 player like Craig needs in order to deliver his application to his
customer does not yet exist, except at AOL. In all likelihood, GrandCentral was in the latter
stages of delivering their application when the AOL program was announced as well. They
were probably already locked and loaded on their plan to provide new numbers.
Similarly, one of the reasons I am so bullish on Jajah (and there may be reason to be bullish on
RebTel too, but I don’t have enough knowledge of their business plan to comment intelligently),
is that Jajah articulates a similar platform vision. Spend a little time chatting with Roman
Scharf. You will learn that they intend to provide web widgets for a variety of presence and
click-to-call applications. You will learn that their model is to build an ecosystem of partners all
delivering services through their network. Who cares if Jajah is a minute-stealing play today?
Where will they go with it?
Jajah and AOL aspire to the same vision. They’re both seeking to build a large community of
users, and to expose that community of users to third party developers. That’s so very
Voice 2.0!! Jajah’s recent mobile announcement is just one more step on that path. They’re
going to grow their community of users dramatically with this step, in part because the minutes
are very cheap (except, perhaps, to American customers who are already accustomed to cheap
minutes), and in part because it’s completely transparent to use the service.
So, to Ted Wallingford I say “Rome wasn’t built in a day”. These are all concrete steps to
getting to Voice 2.0. And to Ken Camp, “you’re right, it’s about the customer!”
Have patience, we’ll get there. It’s a big vision, and it will take time to be realized fully.
That’s Voice 2.0! I didn’t call it by that name, but the vision has remained the same.
Ultimately, at iotum, we chose to focus only on the iotum Relevance Engine, concluding that the
vision of Voice 2.0 was too large for us to execute ourselves. As many potential investors told
us, it’s hard to boil the ocean when all you’ve got is an electric kettle. And none of them were
willing to provide us with anything more than an immersion heater… Two years later, in October
of 2005, I wrote the Voice 2.0 Manifesto in an attempt to articulate that vision for our
industry. My hope was that you might believe the same things that I believed, and that by
enrolling a multitude of others in that vision, as an industry we could move forward to a common
goal.
What I wrote about in October 2003 is finally happening today. It’s happening because Jajah,
Rebtel, AOL, GrandCentral, iotum, hullo, Jangl, and a slew of other great companies are
focused on building the next generation communications experience. It’s happening because
AOL, Microsoft, Google, Skype and Yahoo are opening their presence infrastructure. It’s
happening because you and I can go and buy our originations and terminations from my friend
Stephan Monette at Unlimitel, or from Alan Noorda at Nufone, or anyone else that we like. We
all, finally, are beginning to have choices.
For me, and I hope for you too, that’s cause for tremendous optimism.