Most AutoML tools are black boxes. They offer no-code or low-code interfaces (UIs or simple APIs) so that practitioners can get started quickly. While this helps beginners, experienced data scientists and ML practitioners often need more control. Building a predictive model is an iterative process, so this restricted behavior limits how much AutoML tools get used. Keeping the real-life data scientist in mind, we created a programming model called “Gradual AutoML”. It borrows some concepts from functional programming and addresses the entire spectrum of controlled automation. Gradual AutoML allows the data scientist to be in the driver’s seat and use AutoML for assisted driving.
This talk will cover the basics of AutoML and then present Lale (https://github.com/IBM/lale), an open-source, scikit-learn-compatible AutoML library which implements Gradual AutoML. It will include usage examples and code showing how ML practitioners can control certain choices and employ AutoML to do the rest. I will also briefly share how to use Lale for AutoML with imbalance correction, computation of fairness metrics, and bias mitigation. The talk assumes some familiarity with the Python ML ecosystem, but many of the concepts apply to the general AutoML framework.
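To make "control certain choices and employ AutoML to do the rest" concrete, here is a minimal sketch in plain Python. It is not Lale's API: the search space, the `pinned` argument, and `toy_score` are all hypothetical names made up for illustration, and the score is a stand-in for a real cross-validated metric. The point is only that the practitioner fixes some decisions while an exhaustive search covers the remaining ones.

```python
import itertools

# Hypothetical search space: each pipeline decision lists its candidate values.
SEARCH_SPACE = {
    "scaler": ["none", "standard", "minmax"],
    "model": ["logistic", "tree"],
    "depth": [2, 4, 8],
}


def candidates(space, pinned=None):
    """Yield every configuration, holding the practitioner's pinned choices fixed."""
    pinned = pinned or {}
    unknown = set(pinned) - set(space)
    if unknown:
        raise KeyError(f"unknown choices: {sorted(unknown)}")
    keys = list(space)
    # A pinned decision contributes one option; the rest keep their full range.
    options = [[pinned[k]] if k in pinned else space[k] for k in keys]
    for combo in itertools.product(*options):
        yield dict(zip(keys, combo))


def search(space, score, pinned=None):
    """Return the best-scoring configuration among the allowed candidates."""
    return max(candidates(space, pinned), key=score)


def toy_score(config):
    # Stand-in for cross-validated accuracy; a real tool would train a model here.
    bonus = {"none": 0.0, "standard": 0.2, "minmax": 0.1}[config["scaler"]]
    base = 0.7 if config["model"] == "tree" else 0.6
    return base + bonus - abs(config["depth"] - 4) * 0.01


# Fully automated: the search decides everything.
best_auto = search(SEARCH_SPACE, toy_score)

# Gradual: the practitioner insists on a logistic model, the search does the rest.
best_pinned = search(SEARCH_SPACE, toy_score, pinned={"model": "logistic"})
```

Pinning one decision shrinks the space from 18 to 9 candidates here, which is the same trade the abstract describes: the more the practitioner specifies, the less the optimizer has to explore.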
Interview:
What's the focus of your work these days?
I work on AI research. The session I'm going to conduct is on AutoML or AutoAI. Right now, I'm working on AutoAI with foundation models in mind, which are the large language models, the latest in AI.
What's the motivation for your talk at QCon New York 2023?
Most of the commercial or open-source AutoML tools today are black-box tools. Data scientists or ML practitioners who want to use the optimization techniques that AutoML provides get a very black-box interface. They can give their data, tasks, and maybe some other hyperparameters, but that's it. What we want to achieve is to give more control to the data scientists, so they can inject their domain knowledge and intuition into the AutoAI process. Instead of working with a black-box tool, they can have control and provide algorithm choices, hyperparameters, or even the search space. This way, they can try out an iterative process for AutoAI.
How would you describe your main persona and target audience for this session?
Ideally, I would expect them to have some knowledge of using ML, and it would be even better if they have knowledge of the Python ecosystem for machine learning, which includes open-source libraries like Pandas and scikit-learn. If they have used AutoAI, that's great, but if they haven't, I will cover the basics of what it means to take assistance from AutoAI.
Is there anything specific that you'd like people to walk away with after watching your session?
Yes, I would like them to walk away with the understanding that AutoAI is neither daunting nor a black box. They can control a lot of things and even perform complex tasks with it. They can drive how it searches and uses optimization. If they want to tackle complex use cases like imbalance correction or bias mitigation, that's all possible. They should use the right tool to leverage that.
Speaker
Kiran Kate
Senior Technical Staff Member @IBM Research
Kiran is a Senior Technical Staff Member working in the AI Programming Models department at IBM Research. She has been working in ML/AI for more than 13 years and has built several solutions and frameworks using machine learning. She has published in top AI conferences and has filed patents in this area. Kiran has a master’s in computer science from the Indian Institute of Technology, Madras.