Lee Boonstra and Dmitriy Novakovskiy gave the keynote.
Python at Google. Python is widely used at Google; it is one of the company's official languages. It is integral to many Google projects, for instance YouTube and App Engine. And there are lots of open source libraries. Every API has its own Python client.
Google for Python developers. What can you do as a Python programmer on Google? There is the Google Cloud Platform, which consists of many parts and services that you can use:
- You can embed machine learning services like TensorFlow, the speech API, the translation API, etc.
- Serverless data processing and analytics. Pub/Sub, BigQuery (map/reduce without needing to run your own Hadoop cluster), etc.
- Server stuff like Kubernetes and container services.
Machine learning. There have been quite a few presentations on this already. Look at it like this: how do you teach things to your kids? You show them! "That is a car", "that is a bike". After a while they will start learning the words for the concepts.
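To make the "show them examples" idea a bit more concrete, here is a minimal sketch in plain Python. It is not something from the talk: the feature choice (number of wheels, weight in kg), the toy data and the function names `train` and `predict` are all made up for illustration. It averages the examples per label and then picks the label whose average is closest.

```python
from collections import defaultdict
from math import dist  # Python 3.8+


def train(examples):
    """Average the feature vectors per label: one 'prototype' per concept."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(prototypes, features):
    """Return the label whose prototype is nearest to the given features."""
    return min(prototypes, key=lambda label: dist(prototypes[label], features))


# Made-up toy data: (number of wheels, weight in kg) plus the label we "show".
EXAMPLES = [
    ((4, 1200), "car"),
    ((4, 1500), "car"),
    ((2, 12), "bike"),
    ((2, 15), "bike"),
]

prototypes = train(EXAMPLES)
print(predict(prototypes, (2, 20)))    # something light on two wheels
print(predict(prototypes, (4, 1100)))  # something heavy on four wheels
```

Real models are of course far more capable than this, but the shape is the same: labelled examples go in, and something that can name new inputs comes out.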
Machine learning is not the same as AI (artificial intelligence). AI is the process of building smarter computers. Machine learning is getting the computer to actually learn, which is in fact much easier.
Why is machine learning so popular now? Well:
- The amount of data. There is much more data. So we finally have the data we need to do something with it.
- Better models. The models have gotten much better.
- More computing power. Parallelization and so on. You now have the power to actually do it in a reasonable time.
Why Google? TensorFlow is very popular (see an earlier talk about TensorFlow).
You can do your own thing and use TensorFlow and the cloud machine learning engine. Or you can use one of the Google-trained services like the vision API (object recognition, text recognition, facial sentiment, logo detection). Speech API/natural language API (syntax analysis, sentiment analysis, entity recognition). Translation API (realtime subtitles, language detection). Beta feature: the video intelligence API (it can detect the dogs in your video and tell you when in the video the dogs appeared...).
Code and demos. She gave a nice demo about what she could recognize with the Google services in an Arjan Robben image. It even detected the copyright statement text at the bottom of the photo and the text on his lanyard ("champions league final"). And it was 88% sure it was a soccer player. And 76% sure it might be a tennis player :-)
Using the API looked pretty easy. Nice detail: several textual items that came back from the API were then fed to the automatic translation API to convert them to Dutch.
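I did not copy the demo code, so the following is only a rough sketch of what such a round trip could look like when you talk to the REST endpoints directly. It only builds the JSON request bodies and does not send anything: authentication (API key or credentials) is left out on purpose. The helper names and the `SAMPLE_RESPONSE` are my own; the sample is a hand-written stand-in that reuses the two scores mentioned above, not real API output.

```python
import base64
import json

# Public REST endpoints of the vision and translation APIs.
VISION_URL = "https://vision.googleapis.com/v1/images:annotate"
TRANSLATE_URL = "https://translation.googleapis.com/language/translate/v2"


def build_vision_request(image_bytes, max_results=10):
    """Build the JSON body asking for labels, text and logos in one image."""
    return {
        "requests": [
            {
                "image": {
                    "content": base64.b64encode(image_bytes).decode("ascii")
                },
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_results},
                    {"type": "TEXT_DETECTION"},
                    {"type": "LOGO_DETECTION"},
                ],
            }
        ]
    }


def extract_labels(vision_response):
    """Pull (description, score) pairs out of a vision API response."""
    annotations = vision_response["responses"][0].get("labelAnnotations", [])
    return [(a["description"], a["score"]) for a in annotations]


def build_translate_request(texts, target="nl"):
    """Build the JSON body that asks for the texts in the target language."""
    return {"q": list(texts), "target": target, "format": "text"}


# Hand-written stand-in for a response, NOT real API output.
SAMPLE_RESPONSE = {
    "responses": [
        {
            "labelAnnotations": [
                {"description": "soccer player", "score": 0.88},
                {"description": "tennis player", "score": 0.76},
            ]
        }
    ]
}

vision_body = build_vision_request(b"fake image bytes")
labels = extract_labels(SAMPLE_RESPONSE)
translate_body = build_translate_request(text for text, _ in labels)
print(json.dumps(translate_body, indent=2))
```

In practice you would probably use the official Python client libraries instead of hand-built JSON, but this shows how little is involved: one request with a list of wanted features, and the returned descriptions go straight into the translation request.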
TensorFlow demo. He used the MNIST dataset, a set of handwritten digits often used for testing neural nets.
Dataflow is a unified programming model for batch or stream data processing. You can use it for map/reduce-like operations and "embarrassingly parallel" workloads. It is open sourced as Apache Beam (you can use it hosted on Google's infrastructure).
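For those who have not seen a map/reduce-like operation before, here is a small word count in plain Python. Note that this is explicitly not the Dataflow or Apache Beam API, just the underlying pattern with the standard library; the function names are my own. The map step is "embarrassingly parallel" because every line can be handled independently of the others.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_line(line):
    """Map step: turn one line into (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all values that belong to the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: collapse every group of values into one number."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    # Every line is independent, so the map step can run in parallel.
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_line, lines))
    pairs = [pair for chunk in mapped for pair in chunk]
    return reduce_counts(shuffle(pairs))


print(word_count(["the car and the bike", "The bike"]))
```

A hosted service takes care of spreading those steps over many machines, which is the part you do not want to build yourself.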
The flow has four steps:
- Cloud Storage (storage of everything).
- BigQuery (data storage).
- Data Studio (data visualization).
(The demo code can be found in the slides, which will be made available. Googling for it probably also helps.)
Photo explanation: just a nice unrelated picture from my work-in-progress German model railway.
Note for Dutch readers: programming Python+Django in the heart of Utrecht, at the Oude Gracht? Water sector, so lots of data and geo. Fun! Nelen&Schuurmans is hiring. Just send me an email, as the vacancy text isn't online yet :-)