Gael Varoquaux - Dirty Data Science Machine Learning On Non Curated Data | PyData Global 2020

Cleaning data before analyzing it is a major roadblock in data science. I will discuss two specific problems in the context of machine learning: missing values, and categories with variants and typos. This talk draws on recent publications but gives simple solutions in Python.
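To make the two problems concrete, here is a minimal sketch using only the Python standard library. It is not the code from the talk. The helper names (`impute_with_indicator`, `ngram_similarity`) and the sample values are hypothetical, chosen for illustration. It shows one common way to handle each problem: mean imputation that keeps a "was missing" flag, and a character n-gram overlap score that rates a typo variant as close to the original category.

```python
import math
from collections import Counter


def impute_with_indicator(values):
    """Replace NaN with the mean of the observed values.

    Also returns a mask flagging which entries were missing, so a model
    can still use the fact that a value was absent.
    Assumes at least one value is observed.
    """
    observed = [v for v in values if not math.isnan(v)]
    mean = sum(observed) / len(observed)
    imputed = [mean if math.isnan(v) else v for v in values]
    mask = [math.isnan(v) for v in values]
    return imputed, mask


def ngrams(text, n=3):
    """Count the overlapping character n-grams of a string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def ngram_similarity(a, b, n=3):
    """Share of n-grams two strings have in common (multiset Jaccard)."""
    ca, cb = ngrams(a, n), ngrams(b, n)
    shared = sum((ca & cb).values())
    total = sum((ca | cb).values())
    return shared / total if total else 0.0


# Hypothetical sample data, for illustration only.
ages, was_missing = impute_with_indicator([1.0, float("nan"), 3.0])
close = ngram_similarity("police officer", "polce officer")  # typo variant
far = ngram_similarity("police officer", "teacher")          # different category
```

With such a score, a misspelled category can be encoded by its similarity to known categories instead of being treated as an unrelated new value.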

I am a research director at Inria (French National Computer Science Research Institute), studying machine learning for health, as well as a visiting professor at McGill University. I have a strong academic track record in fundamental machine learning and mental health applications (many publications in the best venues such as NeurIPS and ICML, editor at eLife, one of the reference life sciences journals).

I have been a contributor to the numerical Python and PyData stack since the mid-2000s, contributing to NumPy and Mayavi, and later founding scikit-learn and joblib, as well as a few other domain-specific packages.

I have been speaking about and teaching Python and data processing for 15 years. I helped create and curate the scipy lecture notes, and have given many tutorials as well as keynotes at various Python conferences.

PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.