Impressions and Links from
ML Prague 2022

Machine Learning Prague 2022.
May 27 - 29, Prague.

I had the great pleasure of taking part in ML Prague 2022 (The practical conference about ML, AI and Deep Learning applications. May 27 - 29, 2022. Prague).

Below you will find impressions from the conference, and links for further reading.

The ML Prague 2022 conference was held in the beautiful city of Prague (Czech Republic).

Tried to follow as many talks as possible. But, well, these notes are, of course, in no way, shape or form complete...
Rather, they were written on conference nights, as my way of keeping track of the events that I attended at the conference, and as a way of storing links and references for later.

But enough disclaimers. Below you'll find impressions and links from some of the conference talks and seminars.

Great stuff indeed. And much (ML & AI stuff) to look forward to in the coming years!

1. MlPrague 2022.

MlPrague: The practical conference about ML, AI and Deep Learning applications. May 27 - 29, 2022. Prague.

1.1. Conference Venue.

Used Google Streetview to check out the location of the venues,
before I actually arrived in Prague...

Cevro Institute in Prague, as seen on Google Streetview.

La Fabrika, as seen on Google Streetview.

2.1. Workshops. Friday. May 27th.

Queue in front of the Cevro institute, Friday morning.

2.1.1. Language Model Essentials: Pre-training, Metrics, and Community (Workshop).

The first workshop was ''Language Model Essentials: Pre-training, Metrics, and Community'' by Nick Doiron, HPE.

Using models from HuggingFace, we started by taking a look at some movie review exercises [1], [2]
(Simple Transformer [3] and Starter Adapter [4]).

(More ''Sentiment Analysis Resources'' can be found e.g. here.)

Using models like GPT-2, it becomes possible to do rather advanced things, like completing sentences:

Jamie is a student. Every day she wakes up and goes to ...(?)

Don't board the train without ...(?)

So, the models can be used for things like talking about learned facts, co-writing stories and programs (Latitude), and classification (e.g. sentiment analysis).
See MLP GPT example [5] (login on Colab).

And notice that (these models are pretty big...):

GPT-2 Medium has 345 million parameters, compared to the full GPT-2 release (1.5 billion parameters) and GPT-3 (175 billion).
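To get a feel for these sizes, here is a back-of-the-envelope sketch of the memory needed just to hold the weights. The parameter counts are the ones quoted above; the bytes per parameter (4 for float32, 2 for float16) are standard, while the function name is only for illustration. Activations, optimizer state and so on are ignored, so real requirements would be higher.

```python
# Rough memory needed just to hold the model weights.
# Parameter counts are those quoted in the notes above.
PARAMS = {
    "GPT-2 Medium": 345e6,
    "GPT-2 (full release)": 1.5e9,
    "GPT-3": 175e9,
}


def weight_memory_gb(n_params, bytes_per_param=4):
    """Size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


for name, n in PARAMS.items():
    fp32 = weight_memory_gb(n, 4)  # float32: 4 bytes per parameter
    fp16 = weight_memory_gb(n, 2)  # float16: 2 bytes per parameter
    print(f"{name}: {fp32:.1f} GB in float32, {fp16:.1f} GB in float16")
```

So even before any computation happens, the largest of these models will not fit on a single ordinary GPU.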

Future models will probably be even more amazing, and understand a task from just 0-3 examples, e.g.:

Write a summary.

Complete a Python function.

Label text positive or negative (based on just a few examples).
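The last item (labeling from just a few examples) is usually done by putting the examples directly into the prompt. A minimal sketch of how such a prompt could be assembled is shown here. The ''Review:'' / ''Sentiment:'' layout, the function name and the example reviews are my own assumptions, not something taken from the workshop material, and the resulting string would still have to be sent to a language model for completion.

```python
def build_few_shot_prompt(examples, query):
    """Format labeled examples plus one unlabeled query as a single prompt."""
    blocks = []
    for text, label in examples:
        blocks.append(f"Review: {text}\nSentiment: {label}\n")
    # The final entry is left open so that the model completes the label.
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n".join(blocks)


# Made-up reviews, purely for illustration.
examples = [
    ("A wonderful, moving film.", "positive"),
    ("Two hours I will never get back.", "negative"),
]
prompt = build_few_shot_prompt(examples, "Great cast, but the plot drags.")
print(prompt)
```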

Didn't quite get around to it, but looked a little at an exercise about ''Locating and Editing Factual Associations in GPT'', just before the workshop ended. Indeed:

How does GPT know that the Eiffel Tower is in Paris? [6].

Jupyter notebooks, plus more material, can be found on Nick Doiron's (Prague workshop) Google drive.
Additional (NLP language model) notebooks from Jay Alammar can be found here.

An awesome workshop, indeed.

2.1.2. Practical aspects of reinforcement learning (Workshop).

The material for the ''Practical aspects of reinforcement learning'' workshop can be found here,
with Jupyter notebooks (from the presenters' company, Dataclair) here, including solutions (e.g. here).

From the presenters' introduction to the workshop:

First we review building blocks of the ReInforcement framework (from this workshop) and then walk you through building a custom implementation of those elements which will include a lot of code running on tf.Graph.
Preparing you to start using TF-Agents in your own projects!

All, very helpful in upcoming projects!

Indeed, again, a great workshop!

So, indeed, what a great day it turned out to be in the Cevro Institute.

2.2. Saturday. May 28th.

Impressions from the first conference day.
At the La Fabrika venue. Prague. Saturday, May 28th.

Conference, LaFabrika. MlPrague. Prague 2022.

2.2.1. The high-dimensional geometry of deep neural network loss landscapes.

Back in ''MlPrague 2019'' Tomaso Poggio talked about ''the 3 main theoretical puzzles of Deep Learning'' (See here, Section 4.1).

The next question one could ask (concerning deep learning networks) is: why don't the gradient descent methods (for finding the weights of the networks) get stuck in local minima? Interestingly, we were shown how it is possible to show that stochastic gradient descent methods find (with high probability) the global minimum, which then assures us that the algorithms that we use actually work ... great... [7].

Picking up Poggio's challenge, Stanislav Fort also touched upon these Deep Learning puzzles. In his words:

Despite their tremendous success, we still do not have a detailed, predictive understanding of how they work and what makes them so effective. In this talk, I will describe recent efforts to understand the structure of deep neural network loss landscapes and how gradient descent navigates them during training.

Many interesting details (in the talk).
And, certainly, important work (See more here).

2.2.2. Seznam search goes semantic.

Jakub Náplava, Seznam.cz, talked about ''Semantic search''.

It is important that we can retrieve documents that do not have a textual match with the query but are semantically relevant to it.

The so-called vector branch comprises new vector indices that allow for fast retrieval of tens of thousands of potentially semantically-relevant documents which are further filtered using features computed by more computationally expensive Siamese BERT-based models [8].
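The two-stage idea (cheap vector retrieval first, expensive model only on the survivors) can be sketched in a few lines. This is a toy illustration under my own assumptions, not Seznam's implementation: the 3-dimensional ''embeddings'', the brute-force search standing in for a real vector index, and the lookup table standing in for the Siamese BERT-based scoring are all made up.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query_vec, index, k):
    """Stage 1: cheap candidate retrieval by vector similarity (brute force here)."""
    ranked = sorted(index, key=lambda doc_id: cosine(query_vec, index[doc_id]), reverse=True)
    return ranked[:k]


def rerank(candidates, expensive_score, top_n):
    """Stage 2: re-score only the candidates with a costlier model."""
    return sorted(candidates, key=expensive_score, reverse=True)[:top_n]


# Toy embeddings; a real system would use learned document vectors.
index = {
    "doc_cars": [0.9, 0.1, 0.0],
    "doc_autos": [0.8, 0.2, 0.1],
    "doc_cooking": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

candidates = retrieve(query, index, k=2)
# Placeholder for the expensive model: a fixed table of made-up scores.
fake_model_scores = {"doc_cars": 0.7, "doc_autos": 0.9, "doc_cooking": 0.1}
results = rerank(candidates, fake_model_scores.get, top_n=1)
print(candidates, results)
```

The point of the split is that the costly scorer only ever sees the k candidates, never the whole collection.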

The deployment of these new techniques improved overall search quality significantly.

Indeed, interesting ''hands-on'' work, with many interesting details about their work with these techniques (BERT etc.).

2.2.3. Towards human-like synthetic voice.

Petr Fousek, The Mama ai, talked about ''Towards human-like synthetic voice''.

Synthetic speech is on its way to replacing humans in all sorts of dialogue systems (in phones, voice services etc.).

But (we should remember that) audio is much richer than text. A voice gives us:

Voice: Speaker identity.

Emotion: Sad, angry, neutral, happy.

Style: Expressive, indifferent.

Prosody: Pitch, tempo.

Environment: Background noise, other audio.

Still, many systems have worked on generating audio from acoustic features:

WaveNet (2016) [9].

WaveGlow (2018) [10].

HiFi-GAN (2020) [11].

UnivNet (2021) [12].

Following in these footsteps:

At MAMA.ai we work towards the goal of building customizable voices which would carry personality traits of real humans. We use open-source technologies and data. In the talk we will show where we are, how we build voice models and share the lessons learned on our way.

All, very interesting, indeed.

2.2.4. Lessons learned while training GANs.

Jan Maly, STRV, talked about ''Lessons learned while training GANs''.

Generative Adversarial Networks applications have seen astounding growth even though there are still many challenges in training. We will share some generalizable techniques that helped us overcome challenges when separating music tracks from audio recorded during live events, such as sports matches.

There are many types of GANs, e.g.: Conditional GANs, Semi-supervised GANs, InfoGANs, AC-GANs.

And therefore, there are many different techniques one needs to master in order to train these different types of GANs.

Still, the speaker suggested some general things to ''notice'' / ''look into'' when training GANs (techniques they have had some success with).
E.g.:

Improving vanilla GANs by adding a condition into the Generator and Discriminator networks.
Natural for many problems: generated examples should match specified properties.
Conditioning usually helps with stability, as it adds more structure.

Clearly, one should also try to get the GAN-architecture ''just right''.

For a fitting architecture, they notice that:

Deep Convolution GANs (DCGANs) are now standard.
Avoid sparse gradients (ReLU -> LeakyReLU, MaxPool -> average pooling or Conv2d + stride, and upsampling -> PixelShuffle or ConvTranspose2d + stride. See: ''Tips for Training GANs'').
Experiment with ''battle-tested'' architectures like ResNet. To avoid issues with gradients, use pre-activation.
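Why the swap from ReLU to LeakyReLU helps with sparse gradients can be seen from the derivatives alone. A minimal sketch in plain Python (no deep learning library), where the negative slope of 0.2 is my assumption of a commonly used value, not a number from the talk:

```python
def relu_grad(x):
    """Derivative of ReLU: zero for every non-positive input."""
    return 1.0 if x > 0 else 0.0


def leaky_relu_grad(x, negative_slope=0.2):
    """Derivative of LeakyReLU: a small non-zero slope for negative inputs."""
    return 1.0 if x > 0 else negative_slope


pre_activations = [-2.0, -0.5, 0.3, 1.5]
relu_grads = [relu_grad(x) for x in pre_activations]
leaky_grads = [leaky_relu_grad(x) for x in pre_activations]
print(relu_grads)   # zeros wherever the input was negative
print(leaky_grads)  # every unit still passes some gradient back
```

With ReLU, units with negative input pass no gradient back at all, whereas with LeakyReLU every unit still contributes a small signal.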

An awesome talk, indeed!

They have also found that non-standard discriminators have worked well (for them),
just as ''Relativistic'' cost functions have (''Relativistic GAN, RS-GAN stands out'' [13], [14], [15]).
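For reference, the relativistic standard GAN (RSGAN) losses, as I understand them from the Relativistic GAN paper, compare the critic's score for a real sample directly with its score for a fake one. The sketch below works on plain lists of critic outputs. The pairing of real and fake samples and the function names are my own simplifications, not code from the talk.

```python
import math


def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def rsgan_losses(critic_real, critic_fake):
    """RSGAN losses for paired critic outputs C(x_real), C(x_fake).

    Discriminator: -mean(log sigmoid(C(x_real) - C(x_fake)))
    Generator:     -mean(log sigmoid(C(x_fake) - C(x_real)))
    """
    pairs = list(zip(critic_real, critic_fake))
    d_loss = -sum(log_sigmoid(r - f) for r, f in pairs) / len(pairs)
    g_loss = -sum(log_sigmoid(f - r) for r, f in pairs) / len(pairs)
    return d_loss, g_loss


# Made-up critic outputs where real samples score higher than fake ones.
d_loss, g_loss = rsgan_losses([2.0, 1.0], [-1.0, 0.5])
print(d_loss, g_loss)
```

When the critic rates real above fake, the discriminator loss is small and the generator loss is large, which is the signal that pushes the generator to close the gap.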

Interestingly, adding ''attention mechanisms'' is also something they would suggest we should all take a closer look at:

Extending models with an attention mechanism.
Attention allows attention-driven, long range dependency modelling.
The generator uses cues from all feature locations.

Very useful comments.
And, for most of us: A lot of things to test out (now and in the coming years)!

2.2.5. Living in Perfect Harmony.

Yama Anin Aminof, Meta (formerly Facebook), talked about ''Living in Perfect Harmony - Where Music and Machine Learning Meet''.

In this talk, we will dive into the world of song analysis and the extraction of lyrical and musical features.

In this interesting talk I was particularly fascinated by her comments on ''Compression ratios'' in hit songs.

A good song depends on repetition – both of the tune and the lyrics. Too much repetition and it is just boring; too little, and it can lack structure [16].

Turns out that hit songs have just a little bit more repetition than average songs, and that repetition has gone up in recent years...
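The idea of using a compression ratio as a repetition measure is easy to try at home. Below is a small sketch using zlib from the Python standard library. Using zlib is my own choice of compressor and may differ from what was used in the talk or in [16], and both ''lyrics'' are made-up strings.

```python
import zlib


def compression_ratio(text):
    """Raw size divided by compressed size; higher means more repetition."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw, 9))


# Made-up "lyrics": one very repetitive, one with hardly any repetition.
repetitive = "la la la la la la la la " * 20
varied = (
    "The quick brown fox jumps over the lazy dog "
    "while seven wizards quietly juggle boxes."
)

print(f"repetitive: {compression_ratio(repetitive):.1f}")
print(f"varied:     {compression_ratio(varied):.1f}")
```

Note that very short texts compress poorly because of the compressor's fixed overhead, so ratios are best compared between texts of similar length.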

Indeed, there is a lot to notice, if you (a data scientist) begin to analyze songs [17].
Fascinating, indeed.

With more here: ''Song Classification'', ''The Perfect Song for your Favourite Singer''.

2.2.6. Paper Plane Competition.

The winner of the Paper Plane Competition won €500 [18].

2.2.6. Rule Induction and Reasoning in Knowledge Graphs.

Daria Stepanova, Bosch Center for AI, talked about ''Rule Induction and Reasoning in Knowledge Graphs''.

Advances in information extraction have enabled the automatic construction of large knowledge graphs like DBpedia, YAGO, Wikidata or Google Knowledge Graph.
...
A particular emphasis is put on the problem of learning exception-enriched and numerical rules from highly biased and incomplete data.

According to Plato, ''knowledge is justified true belief''.
The only problem is getting that knowledge...

So, how do we know where Mirka Federer, the wife of tennis star Roger Federer, lives?
Well...

Married people live together.
Mirka is married to Roger.
Roger lives in Bottmingen.
We assume: Mirka lives in Bottmingen.
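Written out as a rule over (subject, predicate, object) triples, the reasoning above looks like the sketch below. The predicate names marriedTo and livesIn and the function name are my own labels for illustration. A real rule learner would of course have to induce the rule from the data, and handle exceptions, rather than have it hard-coded.

```python
# A tiny knowledge graph as (subject, predicate, object) triples.
facts = {
    ("Mirka", "marriedTo", "Roger"),
    ("Roger", "livesIn", "Bottmingen"),
}


def apply_spouse_rule(triples):
    """Rule: marriedTo(X, Y) and livesIn(Y, Z) => livesIn(X, Z)."""
    inferred = set()
    for x, p1, y in triples:
        if p1 != "marriedTo":
            continue
        for y2, p2, z in triples:
            if p2 == "livesIn" and y2 == y:
                inferred.add((x, "livesIn", z))
    # Return only facts that were not already in the graph.
    return inferred - set(triples)


predicted = apply_spouse_rule(facts)
print(predicted)
```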

Now we ''just'' need to automate the process with:

Fact prediction.
Fact checking.
Data cleaning/Domain descriptions.
Etc.

More precisely described here and here.

Great stuff indeed.

2.2.7. Multi-modal question answering on text and tables.

Timo Möller, Deepset, talked about ''Multi-modal question answering on text and tables''.

In this talk we were introduced to the open-source framework Haystack:

Haystack is an open-source framework for building search systems that work intelligently over large document collections.
Recent advances in NLP have enabled the application of question answering, retrieval and summarization to real world settings and Haystack is designed to be the bridge between research and industry [19].

Haystack can be used for:

Question Answering (Find documents relevant to the query) [20].
Summarization.
Document Search (Semantic document search) [21].
Question Generation (Output questions a document can answer) [22].

It all sounded pretty impressive, indeed!

2.2.8. Alquist, the social bot.

Jan Šedivý, CIIRC, Czech Technical University, talked about ''Alquist, the social bot''.

Alquist is a social bot developed by a group of doctoral students working in Conversational AI at CIIRC CTU.
...
Alquist carries an engaging and entertaining dialog about popular topics such as sports, celebrities, movies, etc.

''Alquist (Czech Technical University) won the Alexa Prize SocialBot Grand Challenge 4 competition''.
Where the Alexa Prize is:

A series of competitions for university students dedicated to accelerating the field of artificial intelligence. Participating teams will advance several areas of AI through generalizable methodologies such as continuous learning, teachable AI, multimodal understanding, and reasoning [23].

The ''end goal'' of the competition (not reached in this year's competition) is:

Focused on creating a SocialBot, an Alexa skill that converses coherently and engagingly with humans on popular topics and news events for 20-minutes, and achieve an average rating of at least 4.0/5.0.

But, clearly, Alquist is pushing in that direction...

And, with FlowStorm you can:

Connect your digital persona to a 3D avatar and enjoy a conversation including facial expressions that make the interaction more human [24].

It is all pretty impressive, indeed.

2.2.9. Demo: Realtime Crowd Insights.

By now it is almost a tradition that I spend some time at the ''Realtime Crowd Insights'' demo booth at MlPrague (Microsoft Cognitive Services).

I did so in 2019 (See section 4.3), and again this year.

Indeed, it is loads of fun!

2.3. Sunday. May 29th.

2.3.1. Deep Learning Discovery of new Exoplanets.

Hamed Valizadegan (on Zoom) talked about Deep Learning Discovery of new Exoplanets.

There are a number of ways to detect exoplanets.
One way is: ''If a planet crosses (transits) in front of its parent star's disk, then the observed visual brightness of the star drops by a small amount, depending on the relative sizes of the star and the planet [25]''.
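How small is ''a small amount''? To a first approximation the fractional drop in brightness is the ratio of the two disk areas, i.e. (planet radius / star radius) squared. A quick sketch, assuming a completely dark planet crossing a uniformly bright stellar disk (limb darkening and grazing transits are ignored), with rounded radii for the Sun, Jupiter and Earth:

```python
# Rounded radii in kilometres.
R_SUN_KM = 695_700     # nominal solar radius
R_JUPITER_KM = 71_492  # Jupiter, equatorial radius
R_EARTH_KM = 6_371     # Earth, mean radius


def transit_depth(planet_radius, star_radius):
    """Fractional drop in brightness: ratio of the two disk areas."""
    return (planet_radius / star_radius) ** 2


jupiter_depth = transit_depth(R_JUPITER_KM, R_SUN_KM)
earth_depth = transit_depth(R_EARTH_KM, R_SUN_KM)
print(f"Jupiter-size planet: {jupiter_depth:.2%}")
print(f"Earth-size planet:   {earth_depth:.4%}")
```

So a Jupiter-size planet dims a Sun-like star by roughly one percent, and an Earth-size planet by less than a hundredth of a percent, which gives an idea of why automated classification is needed.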

Just looking at the raw pixels from an observation is extremely difficult though.

In the project ExoMiner, Hamed Valizadegan and colleagues have then used deep learning to classify and validate potential exoplanet finds.

The Kepler and TESS missions have generated over 100,000 potential transit signals that must be processed in order to create a catalog of planet candidates.
So, during the last few years, there has been a growing interest in using machine learning to analyze these data in search of new exoplanets.

So far (May 2022), ExoMiner has validated 301 new exoplanets.

Exciting stuff, indeed!

2.3.2. Deploying transformers at scale.

Pieter Luitjens, Private AI, talked about ''Deploying transformers at scale''.

Transformer networks have taken the NLP world by storm, powering everything from sentiment analysis to chatbots. However, the sheer size of these networks presents new challenges for deployment, such as how to provide acceptable latency and unit economics.

Following in the footsteps of other ''Notes from Industry'' presentations on how to actually deploy such models in real life [26], we were (also here) given some insights into transformer deployment, along with some notes on the performance when using the presented tools (See more at Private AI):

From the Deploying Transformers at Scale talk

All pretty cool indeed.

2.3.3. Conclusion. Sunday.

Indeed, a great conference. With many memorable talks.

Indeed, all in all, super interesting, and certainly thoughts and material to consider for future classes in Deep Learning...

3. Trip Impressions.

Impressions from the trip to Prague.

3.1. Money.

The currency in the Czech Republic is the Czech koruna.
The koruna is one of the European Union's nine currencies (as of 2022), and the Czech Republic is legally bound to adopt the euro in the future.

So, well, exchanged some euros before I landed in Prague.

The Czech koruna, also known as the Czech crown.

3.2. Travel.

Flew with Lufthansa to Prague.

Frankfurt airport.

3.3. Hotel Prague.

Stayed at the Charles Bridge Palace Hotel in Prague.

3.4. Kafka.

Hlava Franze Kafky
(outside the Quadrio shopping centre in Prague).

For more about Kafka, see my 2019 trip to Prague (section 5.4).
That is: Other statues, birthplace, the insurance company (''Worker's Accident Insurance Institute for the Kingdom of Bohemia''), Cafe Louvre and much more.