The Conference for Machine Learning Innovation (ML & Voice 2018). December 5 - 7, Berlin.
I had the great pleasure of taking part in MLCon 2018 (The Conference for Machine Learning Innovation. December 5 - 7, 2018. Berlin).
Below you will find impressions from the conference, and links for further reading.
The MLCon 2018 conference was held at the Steigenberger Hotel am Kanzleramt in Berlin, Germany.
I tried to follow as many talks as possible. But, well, these notes are, of course, in
no way, shape or form complete...
Rather, these notes were written on conference nights, as my way of
keeping track of the events that I attended at the conference, and as a way of storing links and references for later.
But enough disclaimers. Below, you'll find impressions from some of the conference talks and seminars,
along with links for further reading.
Great stuff indeed. And much (ML & Voice stuff) to look forward to in the coming years!
1. Introduction.
1.1. Page Overview. - Presentations, Keynotes and Workshops.
Below, in sections 2 - 4, you will find impressions and links from the presentations and keynotes that I followed on Wednesday and Thursday, as well as from the workshop that I followed on Friday.
Please notice: These notes don't do justice to
the often brilliant presentations that initiated them! So, please read
the original presentations to avoid any distortions ...
2. Impressions from Wednesday, December 5th.
2.1. Look who's talking. A case for branded personas in connected devices.
Wally Brill is Head Of Conversation Design Advocacy & Education at Google.
According to Brill:
I'll admit it: I love talking with robots and teaching them to talk back. It's been that way since I was a kid chatting to my toys with sadly no response.
Since then, and for 20 years I've taught robots all over the World to carry on natural conversations for the Fortune 500 and their customers.
And, sure, he is probably right:
A revolution might be underway: as customer contact through Amazon Alexa and Google Assistant
becomes ubiquitous, and enterprises create their own voice apps and chatbots, the
way we deal with computers will obviously change dramatically.
Interestingly, Brill took us even further, to a future where voice helps
ensure brand identity, and where enterprises use voice as a valuable new channel for direct connection with consumers.
One important thing to notice here is that every voice has a persona.
There is no such thing as a no-persona voice.
And voices are personal things: humans will compare themselves to the voice, asking
themselves: ''What is the relationship between me and the voice?''.
Does the voice belong to a ''solid expert'', or does it belong
to a ''hip friend'' who has just discovered some facts that he or she wants to share casually with us?
Should apps change their approach/voice according to who they are talking to?
Well, they probably will...
A very interesting talk, indeed!
2.2. AI for business - Experiences and Benchmarks.
Up next was Ulrich Bodenhausen with a talk about how to introduce AI (Here: Machine Learning)
in an organization.
According to Bodenhausen, priority A is collecting data!
There is no data like more data!
As an AI coach he then suggested that organizations start by training a core team,
exploring ideas, and eventually selecting some business cases for more advanced development.
In the idea phase it might be useful to know what the company's competition is
doing. And it is probably also a good idea to give people time to explore;
it usually pays back...
One shouldn't start out too ambitious. Starting out small, and then
growing the solution is usually better.
And you shouldn't wait for perfection.
Putting it into production, and getting more data to work with, is usually a better idea.
It all sounded very reasonable, indeed!
2.3. Building smarter apps with Machine Learning. From Magic to Reality.
Laurent Picard followed with
an inspired talk about smarter apps.
Conferences are, of course, all about human learning.
During a conference, participants are asked to extract useful data from floods of incoming data traffic.
Obviously, not easy.
But in the end it is, of course, all worth it, as it lets us make all of these
cool things that our civilization is all about.
And here Laurent Picard showed us some pretty neat new tricks as part of his presentation.
This being a machine learning conference, the new tricks were, of course, ML tricks...
We looked at different examples of ''Ready-to-use Machine Learning Models'' (services), and how to use
them in (our) apps.
Ready-to-use Machine Learning Models (input -> service -> output):
Image -> Cloud Vision -> Info
Video -> Cloud Video Intelligence -> Info
Text -> Cloud Natural Language -> Info
Text -> Cloud Translation -> Translation
Speech -> Cloud Speech-To-Text -> Text
Text -> Cloud Text-To-Speech -> Speech
2.3.1. Cloud Vision.
First up were some ''magic'' tricks using Google Cloud Vision.
Following Picard's instructions, I tried it with a selfie (of myself), and learned
that it is rather unlikely that I show joy, sorrow, anger or surprise (in the photo)...
And it is apparently pretty easy for the service to guess that the portrait comes from LinkedIn,
or a similar site. And that I have a connection to Aarhus University, and work with IT...
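For the curious, here is a minimal sketch of what such a face-detection call can look like with the google-cloud-vision Python client (my own example, not Picard's code; it assumes configured GCP credentials and a local selfie.jpg, and the client API details vary a bit between library versions):

    # A minimal sketch of face detection with the Cloud Vision Python client.
    # Assumes google-cloud-vision is installed and GCP credentials are configured.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    with open("selfie.jpg", "rb") as f:          # hypothetical local image
        image = vision.Image(content=f.read())

    # The face_detection helper returns likelihoods for joy, sorrow, anger, surprise.
    response = client.face_detection(image=image)
    for face in response.face_annotations:
        print("joy:", face.joy_likelihood, "sorrow:", face.sorrow_likelihood)
        print("anger:", face.anger_likelihood, "surprise:", face.surprise_likelihood)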
Video (Cloud Video Intelligence) works pretty much the same way.
And Laurent had a lot of fun adding moustaches to us all, as we used his video service.
Nota bene: For more about Computer Vision take a look here.
The services can also translate text between some 100+ languages (see Google's Cloud Translation Service).
Clearly, there was also a demo of that.
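A minimal sketch of such a translation call with the google-cloud-translate Python client (my own example, assuming the v2 ''basic'' API and configured credentials):

    # A minimal sketch of the Cloud Translation client (v2 "basic" API).
    # Assumes google-cloud-translate is installed and credentials are configured.
    from google.cloud import translate_v2 as translate

    client = translate.Client()
    result = client.translate("Machine learning is fun", target_language="de")
    print(result["translatedText"])  # e.g. "Maschinelles Lernen macht Spass"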
And we could, of course, also play around with Google's Speech Recognition.
All in all, an awesome presentation!
2.4. Making Enterprises Intelligent with Machine Learning.
Sebastian Wieczorek talked about ''Making Enterprises Intelligent with Machine Learning''.
According to Wieczorek, 60% of all (human) knowledge work can, in the coming years, be replaced with technology.
In a process where the ''Intelligent Enterprise'' goes from being a place where
humans define how to do the work, to a place where humans control the workflow, to, finally, a place where humans supervise the automated processes:
Human defines --> Human controls --> Human supervises.
Sure, humans will still pick the right algorithms. Just as it will be humans that prepare the
data for the algorithms (i.e., all in all, build the ML pipelines).
It is still not the AIs asking ''What should I do today?''.
But even these (small) steps will, of course, create an awful lot of change in our societies.
Just as there will be loads of ethical questions coming up as machines control more
and more of our world.
E.g. take the task of controlling the air pressure in a passenger airplane:
lower pressure (during flights) makes the plane last longer, while higher pressure makes the
human passengers feel better.
Should AI's be allowed to make the decision on how we should set the pressure in a passenger airplane?
And such questions are, of course, only the beginning.
Interestingly, Wieczorek noted that EU GDPR regulations might be good for privacy in Europe, but
ML companies around the world might find such laws a bit annoying, as they are always eager to get their hands on more data,
of course...
In real-life situations, a company might have 350,000 incoming questions that
can be broken down to basically the same 4 questions being asked again and again.
Creating chatbots that can deal with such flood of incoming questions will, of course,
save a lot of time and energy, if the chatbots are ''good enough''.
But, according to Vercauteren:
Making smart chat bots, that really understand what the user means, can be quite time consuming. A smart bot needs to be trained with an extensive set of expressions, and coming up with fifty or a hundred ways to express the same meaning can be hard, especially for people who are not used to it. In order to enhance the user experience for our clients that make use of our chat bot platform, we are currently implementing text generation.
Loads of interesting NLP techniques can be used to better understand what the users
are actually saying to the chatbot.
E.g. techniques like Word2vec create word embeddings
that can be used to see if words are similar:
Word2vec takes as its input a large corpus of text and produces a vector space, typically of several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space. Word vectors are positioned in the vector space such that words that share common contexts in the corpus are located in close proximity to one another in the space.
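To make this concrete, here is a tiny sketch of training a Word2vec model and checking word similarity with the gensim library (my own toy example, assuming gensim 4.x; the corpus is hypothetical and far too small for real use):

    # A toy sketch of word similarity with gensim's Word2vec (assuming gensim 4.x).
    from gensim.models import Word2Vec

    # A tiny hypothetical corpus; real Word2vec needs a large corpus to work well.
    sentences = [
        ["open", "a", "support", "ticket"],
        ["create", "a", "support", "ticket"],
        ["open", "a", "new", "case"],
        ["create", "a", "new", "case"],
    ]
    model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=100)

    # Words that appear in similar contexts end up close in the vector space.
    print(model.wv.similarity("open", "create"))
    print(model.wv.most_similar("ticket", topn=2))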
Still, making it all work is, of course, not easy.
But a great talk that made us all wiser about how to get started!
In 2018 we've managed to receive Microsoft's Country Partner Award In Russia in the field of Artificial Intelligence. During the last few years I also had several publications...
On wave-access they write:
An ambitious idea that needs to be brought to life might come to us in the form of a braindump, instead of a business plan. Nevertheless, we'll help you implement your vision.
And probably this was what it was all about: how to help smaller companies (or perhaps even bigger ones)
make some sense of their data, and get started with Machine Learning.
Again, an interesting talk.
2.7. Search: The Silver Bullet of Symbolic AI and the Secret of AlphaGo Zero.
Oliver Zeigermann from Embarc
talked about Search, as The Silver Bullet of Symbolic AI and the Secret of AlphaGo Zero.
Zeigermann writes:
All the AI hype these days is around deep learning, and machines that will eventually get rid of us humans...
However, the ground work of AI, without which no self driving car and no AlphaGo Zero would work, are search algorithms.
Indeed, there is more to AI than ML, as Mat Velloso and others have pointed out (with an ''over-fitting'' joke)...
Clearly, there is still someone who needs to ''do'' the knowledge representation and more, before we can
start our Machine Learning algorithms...
E.g. imagine that we want robots to run depth-first, breadth-first or A* algorithms in order to
find a way out of a maze. Well, then we clearly need to encode the terrain first:
(Here we use ''R'' for the
position of the robot, and ''B'' for some blocks in the environment).
Which will, hopefully, in the end, allow the robot to use its algorithms to find its goal.
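To make that concrete, here is a small sketch of breadth-first search over such an encoded grid (my own illustration, not code from the talk; ''G'' is a hypothetical goal marker):

    # A small sketch of breadth-first search on a grid maze (my own illustration).
    # "R" marks the robot's start, "B" marks blocks, "G" is a hypothetical goal.
    from collections import deque

    MAZE = [
        "R..B.",
        ".B.B.",
        ".B...",
        "...BG",
    ]

    def find(symbol):
        for r, row in enumerate(MAZE):
            for c, cell in enumerate(row):
                if cell == symbol:
                    return (r, c)

    def bfs():
        start, goal = find("R"), find("G")
        queue = deque([[start]])
        seen = {start}
        while queue:
            path = queue.popleft()
            r, c = path[-1]
            if (r, c) == goal:
                return path                     # shortest path found
            for dr, dc in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
                nr, nc = r + dr, c + dc
                if (0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0])
                        and MAZE[nr][nc] != "B" and (nr, nc) not in seen):
                    seen.add((nr, nc))
                    queue.append(path + [(nr, nc)])

    print(bfs())  # list of (row, col) steps from R to G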
Some search trees are pretty big though (e.g. for games),
which gives us that ''full, exhaustive search is mostly just not feasible''.
But we obviously don't want to explore these (enormous) complete search trees if
we don't have to.
If we find a part of the tree where we always win, then we go that way, without exploring the rest of the tree.
We can use techniques that limit the search in depth and/or in breadth.
''Limiting in breadth'' using the ''Monte Carlo Game Search'' algorithm would go something like this (see the sketch after the list):
Start from each possible next move from the initial state.
For each next move:
Play a game to the end using a random set of moves.
Repeat a number of times.
Count the number of wins, losses and draws.
Choose the move with the best probability of a win.
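As a sketch of the idea (my own, not code from the talk; it assumes a hypothetical game-state interface with is_over(), legal_moves(), play() and result() methods):

    # A sketch of Monte Carlo game search over a hypothetical game interface.
    # is_over, legal_moves, play and result are assumed helpers, not a real library.
    import random

    def random_playout(state):
        # Play random moves until the game ends; return "win", "lose" or "draw".
        while not state.is_over():
            state = state.play(random.choice(state.legal_moves()))
        return state.result()

    def best_move(state, playouts_per_move=100):
        # Estimate each next move's win probability from random playouts.
        scores = {}
        for move in state.legal_moves():
            next_state = state.play(move)
            wins = sum(random_playout(next_state) == "win"
                       for _ in range(playouts_per_move))
            scores[move] = wins / playouts_per_move
        return max(scores, key=scores.get)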
According to Zeigermann, ''Exploration vs Exploitation'' can probably further guide and improve our search:
Exploitation: Choose moves with high average win ratio.
Exploration: Choose moves with few simulations.
Compare: How would you explore a new city?
If we then further improve by:
Choosing the child to expand using a heuristic of the game state (instead of a random expansion phase).
Having that heuristic determined by a Convolutional Neural Network (ResNet).
Doing the simulation phase by playing against the best known heuristic.
Training the CNN on which states lead to a win.
Then we are quickly approaching something like the algorithms used by AlphaGo.
For more, see AlphaGo Zero: Learning from scratch.
Leading Zeigermann to conclude that: ''Search is omnipresent in AI''.
I.e.
Path finding is dominated by variants of A*.
Chess can be solved using tweaked Alpha-Beta-Search.
Advanced Monte Carlo Methods can be used for many games.
And awesome programs like AlphaGo use variants of Monte Carlo methods that also incorporate Deep Learning and CNNs.
An awesome talk!
3. Impressions from Thursday, December 6th.
3.1. Text classification and NLP on Star Wars characters.
Natalie Beyer (Lavrio Solutions), started Thursday with a talk about ''Text classification and NLP on Star Wars characters''.
According to Natalie Beyer:
Text classification can be very important in businesses. Some tasks involve a lot of repetitive, error-prone processes that could be automated. One of those could be deciding which category certain unstructured text data belongs in.
As an example of Natural Language Processing we looked at the characters from Star Wars,
and tried to figure out who had said which sentences in the movies.
i.e.
Han Solo: ''Traveling through hyperspace ain't like dusting crops, farm boy''.
Darth Vader: ''The force is strong with this one''.
Obi Wan Kenobi: ''Use the force, Luke''.
But who said:
'' I'm rather embarrassed, General Solo, but it appears that you are to be the main course at a banquet in my honor''.
In the talk, Natalie Beyer showed us various useful algorithms, from classical machine learning to neural networks using TensorFlow.
We were also given a brief introduction to a couple of popular open source libraries that can be helpful
when it comes to Natural Language Processing.
E.g. Beyer recommended the Python open source library SpaCy
for text preparation (as preprocessing before Deep Learning with neural nets).
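As an illustration of that kind of pipeline (my own sketch, not Beyer's actual code, and with a hypothetical mini-dataset), SpaCy preprocessing can be combined with a classical scikit-learn classifier like this:

    # A rough illustration (not Beyer's code): spaCy preprocessing + scikit-learn.
    # Assumes spacy, scikit-learn and the en_core_web_sm model are installed.
    import spacy
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    nlp = spacy.load("en_core_web_sm")

    def lemmatize(text):
        # Reduce words to their lemmas and drop stop words before vectorizing.
        return " ".join(tok.lemma_ for tok in nlp(text) if not tok.is_stop)

    quotes = [
        "Traveling through hyperspace ain't like dusting crops, farm boy.",
        "The force is strong with this one.",
        "Use the force, Luke.",
    ]
    speakers = ["Han Solo", "Darth Vader", "Obi Wan Kenobi"]

    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit([lemmatize(q) for q in quotes], speakers)
    print(model.predict([lemmatize("The force will be with you.")]))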
And she recommended Siraj Raval's videos about NLP,
as an introduction to NLP and Deep Learning.
It was all very entertaining, great fun and very useful. Indeed, again, an awesome talk!
Btw. It was, of course, C-3PO who said:
'' I'm rather embarrassed, General Solo, but it appears that you are to be the main course at a banquet in my honor''.
3.2. Deep Recommendation Systems for real personalization.
Sk Reddy (Hexagon), talked about ''Deep Recommendation Systems for real personalization''.
According to Reddy, recommender systems are really ''Glorified Search Engines'' that
know what you are looking for before you do...
A system that guides users to useful or interesting objects in a large space of possible objects.
Key concepts:
A) Collaborative filtering: Recommend products based on what similar customers have bought.
B) Content filtering: A system is content-based when the recommendation is based on content
you have previously bought.
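As a toy sketch of collaborative filtering (my own made-up example, not from the talk), cosine similarity between users in a small rating matrix already gets you surprisingly far:

    # A toy sketch of user-based collaborative filtering (made-up rating matrix).
    import numpy as np

    # Rows are users, columns are items; 0 means "not rated/bought yet".
    ratings = np.array([
        [5, 4, 0, 1],
        [4, 5, 0, 1],
        [1, 0, 5, 4],
    ], dtype=float)

    def cosine(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

    def recommend(user):
        # Weight every other user's ratings by their similarity to this user,
        # then suggest the unrated item with the highest weighted score.
        sims = np.array([cosine(ratings[user], ratings[o])
                         for o in range(len(ratings))])
        sims[user] = 0.0
        scores = sims @ ratings
        scores[ratings[user] > 0] = -np.inf    # skip items the user already has
        return int(np.argmax(scores))

    print(recommend(0))  # item index recommended for user 0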
Interestingly, Reddy told us that many systems that ask for our feedback (e.g. Netflix after you have seen
a movie) don't really need it. It is only done in order to boost the ego of the user.
The useful info is already logged in the system as ''implicit feedback''.
I.e. how long did you watch the movie? That's your more accurate feedback!
(Nobody cares about your feedback in the form of questionnaires).
Some of the recommender systems out there are, of course, pretty complicated.
In the talk, Reddy told us a little about the ''Yahoo News'' recommendation system, as
well as the ''Google Play Recommendation System'' (YouTube).
Clearly, such systems have had many iterations through the ML-pipeline, with lots
of analyzing and finetuning of the ML model, before they have reached the current
level of sophistication.
In the Yahoo News system, it is important to avoid duplication of
recommendations.
Bag of Words and
Word Embeddings are used to see if articles
are similar (and then the system should obviously not recommend articles you have already seen).
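A tiny sketch of how such a similarity check could look with scikit-learn bag-of-words vectors (my own illustration, with made-up headlines):

    # A sketch of near-duplicate detection with bag-of-words vectors (scikit-learn).
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    articles = [
        "Stock markets rally as tech shares surge",      # made-up headlines
        "Tech shares surge and stock markets rally",
        "Local team wins the championship final",
    ]
    vectors = CountVectorizer().fit_transform(articles)
    sims = cosine_similarity(vectors)
    # Articles 0 and 1 come out nearly identical; don't recommend both.
    print(sims.round(2))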
The goal is to keep users hanging on, with a high ''click through rate'' (CTR).
Which gives you an opportunity to show more ads to the user.
In the YouTube system, we first have to decide how long someone should watch
a video before we can say that they liked it.
And the 300 hours of video that are uploaded to YouTube every minute obviously fall
into many categories. So, the recommendation problem is actually a multiclass classification problem.
Based on ''watch vectors'' and ''search vectors'' (for searches on YouTube), we then want to train
a neural net that can give us a list of candidate videos, where we can present the newest (i.e. freshest) to the user.
Expect to see recommender systems for a) your next vacation (based on vacation history and user profile),
b) stock options, c) your next job (based on resume and availability), etc., soon...
The dominant programming language for deep learning is Python. It has a wide variety of frameworks and data scientists love it due to its ecosystem and the workflows it allows. Yet when it comes to actually taking models to production, it is usually met with resistance...
I.e. many enterprise environments use other languages, such as Java. So here a library like Deeplearning4J becomes relevant.
Here, the session gave a demo of a Deeplearning4J-based model being used
in a Java program.
It all looked pretty straightforward, and easy to implement.
Obviously, a very useful presentation!
3.4. Commodity.ai
Pieter Buteneers, Robovision, talked about ai.now.
Do we want/need Trabant.ai or Ferrari.ai?
Here in this era, where AI is the buzz and disruption word par excellence...
Wisely, Buteneers reminded us that ML never works perfectly, so you shouldn't oversell it!
Buteneers suggested that we should ask ourselves: ''Who is going to use your idea?''
People should be willing to pay for a minimal feature (that you give them).
You can then work from there. Starting from one small feature.
And when that is fully incorporated into a product, you can
ask your customers what they want next.
Sounded like good advice!
3.5. Helping Pacman beat the ghost with deep Q learning.
The agent in state S takes some action which brings the agent to state S' plus gives it a reward R.
At first you might want to play randomly, and give values to your actions as the play progresses.
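This is the standard (tabular) Q-learning setup, where the value update for one (S, action, reward, S') step looks like this (a minimal sketch of the textbook formula, not code from the talk):

    # The standard tabular Q-learning update for one (s, a, r, s') step.
    # alpha is the learning rate, gamma the discount factor; Q is a dict of dicts.
    def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
        best_next = max(Q[s_next][a2] for a2 in actions)
        Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])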
Moving on, he suggested that we use a neural net with some 210 x 160 x 3 colors = 100,800 input values.
Rescaled to black/white it comes to 84 x 84 pixels = 7,056 input values - but still a lot of input possibilities.
With 512 neurons this gives 7,056 * 512 connections = around 3.6 million parameters, which he then
later reduced by switching to a convolutional neural network with 64 filters, kernel size 5, strides 2 = around 1,600
parameters.
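These parameter counts are easy to reproduce; here is a minimal Keras sketch (my own, not code from the talk):

    # Reproducing the parameter counts above with Keras (my own sketch).
    from tensorflow import keras

    # Fully connected: 84*84 = 7056 inputs times 512 neurons ~ 3.6 million weights.
    dense = keras.Sequential([
        keras.layers.Flatten(input_shape=(84, 84, 1)),
        keras.layers.Dense(512, activation="relu"),
    ])

    # Convolutional: 64 filters, kernel size 5, strides 2 -> 5*5*64 + 64 = 1664.
    conv = keras.Sequential([
        keras.layers.Conv2D(64, kernel_size=5, strides=2, activation="relu",
                            input_shape=(84, 84, 1)),
    ])
    dense.summary()
    conv.summary()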
Plankton are the diverse collection of organisms that live in large bodies of water and are unable to swim against a current. The individual organisms constituting plankton are called plankters. They provide a crucial source of food to many large aquatic organisms, such as fish and whales.
Here we are interested in classifying the plankton, in order to see what kinds of animals are living in the sea.
In order to get started they did a lot of preprocessing and data augmentation (a sketch of a similar pipeline follows after the list):
Rescale
Zoom
Rotate
Translate
Flip
Shear
Stretch
(Which created more images to train the deep learning NN on).
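For illustration, similar augmentations can be expressed with Keras' ImageDataGenerator (a sketch under that assumption; this is not the winning team's actual pipeline):

    # A sketch of similar augmentations with Keras' ImageDataGenerator
    # (my own illustration, not the winning team's actual pipeline).
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    augmenter = ImageDataGenerator(
        rescale=1.0 / 255,        # rescale pixel values
        zoom_range=0.2,           # zoom; x and y are sampled independently,
                                  # which also gives a stretch effect
        rotation_range=360,       # rotate
        width_shift_range=0.1,    # translate horizontally
        height_shift_range=0.1,   # translate vertically
        horizontal_flip=True,     # flip
        shear_range=0.2,          # shear
    )
    # flow_from_directory then yields endless random variations of the images:
    # train_gen = augmenter.flow_from_directory("plankton/train", target_size=(95, 95))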
Here they wanted to win a Kaggle competition, so they were willing to
spend the extra time to gain that extra 0.5% precision needed to win the competition.
Lots of interesting comments about the process:
They used weight decay, which helped avoid overfitting problems. But, interestingly, it also helped train the
network faster (possible explanation: if your weights become too big, it becomes more difficult
to correct the network).
Their network trained for 24 hours on state-of-the-art hardware (at the time). They used 215,000
gradient steps to train, and lowered the learning rate after a fixed number of steps.
They experimented with Leaky ReLU activation functions.
And then, of course, they ended up winning the Kaggle competition.
A great talk!
4. Impressions from Friday, December 7th.
4.1. Workshop: Build Your Own Voice Interface with Google Actions.
I followed Jeremy Wilken's (developer at VMware)
excellent Friday workshop about ''Build Your Own Voice Interface with Google Actions''.
There we got started on creating our own voice interfaces using the Google Actions platform
(looking at the technologies involved, and how to plan a voice interaction on the Google platform).
It is expected that voice, in the coming years, will become another surface area for users to interact with products and services.
Eventually becoming the next ''wave'' in IT.
The workshop was a great opportunity to get started with the Google Voice Actions platform, to learn how it works, and to pick up the tools needed to build our own conversational interfaces.
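For a taste of what fulfillment behind such a voice action can look like, here is a minimal webhook sketch (my own illustration, assuming the Dialogflow v2 webhook JSON format; not material from the workshop):

    # A minimal fulfillment webhook sketch (assumes the Dialogflow v2 JSON format;
    # my own illustration, not material from the workshop).
    from flask import Flask, request, jsonify

    app = Flask(__name__)

    @app.route("/webhook", methods=["POST"])
    def webhook():
        req = request.get_json()
        intent = req["queryResult"]["intent"]["displayName"]
        # Answer with text that the Assistant will speak back to the user.
        return jsonify({"fulfillmentText": f"You triggered the {intent} intent."})

    if __name__ == "__main__":
        app.run(port=8080)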
All, super interesting.
And, obviously, a great workshop!
Btw. My own first ''pre-conference test'' of the Google Voice platform can be seen here.
5. Conclusion.
The end of a wunderbar conference, with many memorable talks.
Obviously, I'm already looking forward to my next visit to Berlin!