For any queries or more information about the event, contact us

Knowdis Machine Learning Day 2020

5th December 2020


About


KnowDis Data Science and the School of AI, IIT Delhi bring you the second edition of KnowDis Machine Learning Day. This will be an online event to be held on 5th December 2020.
The first edition of KnowDis Machine Learning Day was organized at IIT Delhi on July 28, 2019. The event celebrated the advances made in the field of Machine Learning and Artificial Intelligence and recognized the contribution of leading researchers and professors to the field.

Event Videos


KnowDis Machine Learning Day in Press


List of Awardees


...

KnowDis ML Award

Dr. Parag Singla

Associate Professor, Department of Computer Science and Engineering, IIT Delhi

List of Speakers


Details of Awardees


...
KnowDis Award for Excellence

Dr. S. N. Maheshwari

Honorary Professor
Former Head of the Department, Computer Science & Engineering

Indian Institute of Technology, Delhi


Research Interests: Algorithms, Parallel Processing, Information Systems

KnowDis Machine Learning Award

Dr. Parag Singla

Associate Professor, Dept. of Computer Science and Engineering

Indian Institute of Technology, Delhi

Research Interests: Statistical Relational Learning (SRL), which aims to combine the power of logic and probability. One of the developers of Alchemy, the first open-source implementation of Markov Logic, and of efficient inference techniques for Computer Vision problems

...

Details of Speakers


...
Speaker

Dr. Dennis Shasha

Professor, Computer Science Department
Courant Institute of Mathematical Sciences
Associate Director & Professor
NYU Wireless

Research Interests: Biological computing, pattern recognition and querying in trees and graphs, pattern discovery in time series, cryptographic file systems, database tuning, and wireless

Topic:- SafePredict: Reducing Errors by Refusing to Guess (Occasionally)

Abstract: We propose a meta-algorithm to reduce the error rate of state-of-the-art machine learning algorithms by refusing to make predictions in certain cases even when the underlying algorithms suggest predictions.

Intuitively, our SafePredict approach estimates the likelihood that a prediction will be in error and, when that likelihood is high, the approach refuses to go along with that prediction. Unlike other approaches, we can probabilistically guarantee an error rate on the predictions we do make (denoted the decisive predictions).

Empirically, on seven diverse data sets from genomics, ecology, image recognition, and gaming, our method can probabilistically guarantee to reduce the error rate to 1/4 of that of the state-of-the-art machine learning algorithm, at a cost of between 11% and 58% refusals. Competing state-of-the-art methods refuse at roughly twice our rate.
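The refusal idea can be illustrated with a toy confidence-threshold rule. This is a hedged sketch only: the function name, threshold rule, and default value are illustrative, and SafePredict's actual mechanism, which provides probabilistic error-rate guarantees, is more sophisticated.

```python
def safe_predict(class_probs, threshold=0.8):
    """Return the predicted class index, or None to refuse.

    A toy illustration of prediction-with-refusal: guess only when the
    top class probability clears a confidence threshold.
    """
    best = max(range(len(class_probs)), key=lambda i: class_probs[i])
    if class_probs[best] >= threshold:
        return best   # a "decisive" prediction
    return None       # refuse to guess
```

For example, `safe_predict([0.95, 0.05])` makes a decisive prediction, while `safe_predict([0.55, 0.45])` refuses; raising the threshold trades more refusals for a lower error rate on the decisive predictions.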

Speaker

Dr. Parag Singla

Associate Professor, Dept. of Computer Science and Engineering
Indian Institute of Technology, Delhi

Research Interests: Statistical Relational Learning (SRL), which aims to combine the power of logic and probability. One of the developers of Alchemy, the first open-source implementation of Markov Logic, and of efficient inference techniques for Computer Vision problems

Topic:- Three Advances in the Space of Neuro-Symbolic Reasoning

...

Abstract: In this talk, we will look at three different advances in the broad space of neuro-symbolic reasoning. First, we will look at the problem of incorporating test-time evidence in a multi-task learning setup to improve prediction accuracy. We will show that the simple idea of backpropagating evidence from an auxiliary task can result in significantly improved predictions. Second, in the space of Adversarial Autoencoders, we will show that optimal generation can only be achieved when the assumed latent dimension is equal to the actual latent dimension. Since the actual dimension may not be known in practice, we present a solution based on masking spurious dimensions in the latent space. Finally, we will present some recent work on solving problems in structured output spaces, such as solving a Sudoku puzzle, exploiting the idea of solution multiplicity. We will show that when one is interested in only one among multiple possible correct solutions, the training loss has to be adapted accordingly. We propose an RL-based framework to decide which solution to train on for a given input, resulting in significantly improved prediction performance.


Bio: Parag Singla is an Associate Professor in the Department of Computer Science and Engineering at IIT Delhi. He holds a Bachelor's in Computer Science and Engineering (CSE) from IIT Bombay (2002) and a Master's in CSE from the University of Washington, Seattle. He received his PhD from the University of Washington in 2009 and did a postdoc at the University of Texas at Austin during 2010-11. He has been a faculty member in the Department of CSE at IIT Delhi since December 2011. Parag's primary research interests lie in machine learning, specifically neuro-symbolic reasoning. In the past, he has also worked extensively on graphical models and statistical relational AI. Parag has close to 30 publications in top-tier peer-reviewed conferences and journals. He also has one best paper award and two patents to his name. He is a recipient of the Visvesvaraya Young Faculty Research Fellowship from the Govt. of India.

...
Speaker

Dr. Mausam

Prof. Department of Computer Science and Engineering
Indian Institute of Technology, Delhi

Research Interests: Neuro-symbolic machine learning, intelligent information systems, natural language processing, and NLP for robotics




KnowDis Machine Learning Medal Awardee, 2019
Speaker

Dr. Zack Dvey-Aharon

Co-Founder and CEO

AEYE Health

Professional Interests: AI, Data Mining, Machine Learning, Pattern Recognition, Data Analytics, System Design, Entrepreneurship, Capital Markets Analytics, Cloud Computing, EEG & Brain-Machine Interface, Online Algorithms, Game Theory, Complexity, System Security, Operating Systems, Project Management.

Topic:- Opportunities and Challenges in Retinal Imaging Analytics

...

Abstract: Retinal imaging is a powerful diagnostic tool. Due to the access and direct view it provides to blood vessels and the nerve system, fundoscopy enables the analysis of systemic conditions on top of sight-threatening indications. In our talk, we'll cover the huge potential that analytics of retinal images holds, as well as the challenges that emerge in its path.

...
Speaker

Dr. Daniel Ting, MD, PhD


Head, AI and Digital Innovation
Singapore Eye Research Institute

Assistant Professor, Duke-NUS
Medical School, Singapore

Topic:- AI in Health: The 5 Rights

Abstract: The advent of artificial intelligence (AI), big data and next-generation telecommunication networks (5G) has generated enormous interest in digital health. Digital health comprises overlapping areas ranging from AI, the internet of things (IoT), electronic health and telehealth to the analysis and use of big data. With substantial innovation opportunities in digital health, the World Health Organization (WHO) published a set of guidelines earlier this year, advising potential researchers and innovators on how to harness this technology to create evidence-based interventions within real-world settings to improve patient outcomes. This talk aims to highlight some of the important principles in building AI algorithms for health.

Speaker

Prof. Dean Ho

Director, The N.1 Institute for Health (N.1) & The Institute for Digital Medicine (WisDM), NUS

National University of Singapore

Research Interests: CURATE.AI-based clinical studies, solid cancer and blood cancer therapy, digital therapeutics/personalised learning, and post-organ transplant immunosuppression, among others.

Topic:- Optimising Drug Development and N-of-1 Healthcare with Digital Medicine

...

Abstract: In the quest for truly optimised medicine, multiple challenges need to be overcome. The right drugs and corresponding doses need to be identified, which can be insurmountable given the very large parameter space involved. In addition, a one-size-fits-all approach serves as a barrier to individualising treatment, as even effective drugs given at incorrect dosages can result in little to no efficacy. Furthermore, these doses may need to be modulated dynamically during the course of treatment, since the patient response to treatment can also be dynamic. Addressing all of these factors can be accelerated by the intersection of novel technology platforms with clinical trial/regulatory innovation. This lecture will highlight the clinical programs of the Institute for Digital Medicine (WisDM) and the N.1 Institute for Health (N.1). We will discuss our recent advances in clinical trials innovation and the clearance of first-in-class patient studies, as well as results from our ongoing clinical development studies. The ultimate objectives of WisDM and N.1, which are already being observed in the clinic, are to dynamically tailor patient-specific treatment outcomes, reduce healthcare costs, and increase accessibility to practice-changing and optimised medicine.


Bio: Prof. Dean Ho is Provost’s Chair Professor, Director of The N.1 Institute for Health (N.1), Director of the Institute for Digital Medicine (WisDM), and Head of the Department of Biomedical Engineering at the National University of Singapore. Using his CURATE.AI platform, Prof. Ho has led multiple pioneering clinical studies that have validated the promise of N-of-1 medicine, where only a patient’s own data is used to personalise their treatment for the entire duration of care. Multiple CURATE.AI-based clinical studies are ongoing or cleared for start in the areas of solid cancer and blood cancer therapy, digital therapeutics/personalised learning, and post-organ transplant immunosuppression, among others. Prof. Ho is an elected member of the US National Academy of Inventors (NAI). He is also a Fellow of the American Institute of Medical and Biological Engineering (AIMBE) and Society for Laboratory Automation and Screening, as well as a Fulbright Scholar. He is also a recipient of the NSF CAREER Award, Wallace H. Coulter Foundation Translational Research Award, and V Foundation for Cancer Research Scholar Award, among others. Prof. Ho was recently named to the Asia Tatler Gen.T List, which honours young leaders who are shaping the future of the region. Prof. Ho has appeared on the National Geographic Channel Program “Known Universe” to discuss his discoveries in drug delivery and imaging. His discoveries have been featured on CNN, The Economist, Forbes, Washington Post, NPR and other international news outlets. He has served as the President of the Board of Directors of the Society for Laboratory Automation and Screening (SLAS), a 26,000+ member global drug development organization comprised of senior executives from the pharmaceutical and medical device sectors, as well as academic thought leaders.

...
Speaker

Dr. Vikram K. Mulligan

Research Scientist
Flatiron Institute

Research Interests: Computational methods for designing folding heteropolymers, using modern machine learning approaches on classical computers, plus new quantum algorithms on current-generation quantum computers.


Topic:- Advancing peptide therapeutic design with modern and emerging computational technologies

Abstract: Synthetic peptide macrocycles represent an attractive class of potential drugs, combining the large recognition surfaces of antibodies with the ease of production and favourable pharmacokinetic properties of small molecules. A current challenge is mitigating the inherent conformational flexibility of peptides, which tends to limit target affinity. In this presentation, I will review my work to generalize the Rosetta software suite, which has historically been used for protein design and structure prediction, to permit the design of peptide macrocycle sequences that are optimized both to make favourable interactions with a target protein and to maximize rigidity of the macrocycle in a binding-competent conformation in the absence of the target. I will present examples of computationally-designed peptide macrocycle inhibitors of key proteins in pathogenic bacteria and viruses, including the SARS-CoV-2 virus. I will show early successes in mapping design algorithms to emerging computational technologies including current- and near-future generation quantum computers. Finally, I will present ongoing work to reduce the computational cost of finding hits using deep convolutional neural networks to approximate the output of computationally-expensive simulations.

Speaker

Dr. Avantika Lal

Senior Scientist in Deep Learning and Genomics
NVIDIA

Research Interests: Genomics, Functional genomics, Next-generation sequencing (NGS), Molecular biology, Microbiology


Topic:- Accelerating Single-cell Genomics with Machine Learning and Deep Learning

...

Abstract: Single-cell genomics experiments measure the properties of the genome in individual cells, enabling us to identify diverse cell types in the human body, and to study how various diseases affect each type of cell. Single-cell experiments are growing in size and complexity, but current analysis methods struggle to scale to large numbers of cells. At the same time, our ability to accurately identify the features of rare cellular populations from such datasets is limited. I will present recent work on deep learning and machine learning tools to improve analysis of single-cell genomic data, with examples of how these tools can be applied to understand the biology of COVID-19.

...
Speaker

Dr. Limsoon Wong

Professor, Computer Science
National University of Singapore (NUS)

Research Interests: Database Theory and Systems; Bioinformatics and Computational Biology; Knowledge Discovery and Data Mining

Topic:- From Bewilderment to Enlightenment in Cancer Research

Abstract: Much of the work on the analysis of big data has focused on machine learning methods, and these are often evaluated based on class-prediction accuracy. Computational research on cancer diagnostic and prognostic biomarkers based on omics data (which is big data in a biology context) is no exception. However, class-prediction accuracy is a quick but superficial way of determining classifier performance. It does not inform on the reproducibility of the findings, or on whether the selected or constructed features are meaningful and specific. Furthermore, class-prediction accuracy over-summarizes and does not inform on how training and learning have been accomplished. It also does not provide explainability in the decision-making process, and it is not objective, as its value is also affected by the class proportions in the validation set. In the first part of my talk, I want to share with you a bewildering observation in cancer prognostic biomarkers research: most random gene expression signatures (think of these as bio big-data-derived classifiers for breast cancer survival) are significantly associated with breast cancer survival. This observation calls into question whether any of the reported gene expression signatures is more meaningful/useful than random ones. A corollary of this observation is that if you write a paper that reports a breast cancer prognostic signature (or presents a computational method to do so) and evaluates it based purely on prediction performance, the journal/reviewer should reject the paper without review. In the second part of my talk, I dissect this bewildering observation, explain the cause of this phenomenon, and show how it can be overcome (and how true prognostic signatures are distinguishable from random signatures).


Bio: Limsoon Wong is a professor of computer science in the School of Computing at the National University of Singapore (NUS). He was also a professor (now honorary) of pathology in the Yong Loo Lin School of Medicine at NUS. Limsoon also co-founded Molecular Connections in India in the early 2000s; as this start-up's chairman, he oversaw its 400x growth over a decade and a half. As a scientist, Limsoon has many well-known results in two distinct fields, database theory and computational biology, and he was inducted in 2013 as a Fellow of the ACM for his seminal work in both fields. These days he works mostly on knowledge discovery technologies and their application to biomedicine, with a current special interest in batch effects in gene expression and proteomic profile analysis.

Speaker

Dr. Yang You

Assistant Professor (tenure-track)
National University of Singapore (NUS)

Research Interests: Machine Learning; High-Performance Computing; Parallel/Distributed Systems

Topic:- Fast and Accurate Deep Neural Network Training

...

Abstract: In the last three years, supercomputers have become increasingly popular in leading AI companies. Amazon built a High Performance Computing (HPC) cloud. Google released its first 100-petaFLOP supercomputer (TPU Pod). Facebook made a submission to the Top500 supercomputer list. Why do they like supercomputers? Because the computation of deep learning is very expensive: even with 16 TPUs, BERT training takes more than 3 days. On the other hand, supercomputers can process 10^17 floating-point operations per second. So why don't we just use supercomputers and finish the training of deep neural networks in a very short time? The reason is that deep learning does not have enough parallelism to make full use of the thousands or even millions of processors in a typical modern supercomputer. There are two directions for parallelizing deep learning: model parallelism and data parallelism. Model parallelism is very limited. For data parallelism, current optimizers cannot scale to thousands of processors because large-batch training is a sharp-minimum problem. In this talk, I will introduce the LARS (Layer-wise Adaptive Rate Scaling) and LAMB (Layer-wise Adaptive Moments for Batch training) optimizers, which can find more parallelism for deep learning. They not only make deep learning systems scale well, but also help real-world applications achieve higher accuracy. Since 2017, all the ImageNet training speed world records have been achieved using LARS. LARS was added to MLPerf, the industry benchmark for fast deep learning. Google used LAMB to reduce BERT training time from 3 days to 76 minutes and achieve new state-of-the-art results on the GLUE, RACE, and SQuAD benchmarks. The approaches introduced in this talk have been used by state-of-the-art distributed systems at Google, Intel, NVIDIA, Sony, Tencent, and others.
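The core layer-wise idea behind LARS can be sketched as follows. This is an illustrative simplification only: the function name is hypothetical, and momentum, learning-rate scheduling, and other details of the published optimizer are omitted. Each layer's update is scaled by the ratio of its weight norm to its gradient norm, keeping update magnitudes proportional to weight magnitudes in large-batch training.

```python
import numpy as np

def lars_trust_ratio(weights, grads, trust_coeff=0.001, weight_decay=0.0):
    """Layer-wise learning-rate scale: eta * ||w|| / (||g|| + wd * ||w||).

    Applied independently per layer, this keeps each layer's update
    magnitude proportional to its weight magnitude, which is the key
    idea behind layer-wise adaptive rate scaling.
    """
    w_norm = np.linalg.norm(weights)
    g_norm = np.linalg.norm(grads)
    if w_norm == 0.0 or g_norm == 0.0:
        return 1.0  # fall back to the global learning rate
    return trust_coeff * w_norm / (g_norm + weight_decay * w_norm)
```

A layer with weight norm 5 and gradient norm 1 would, under these assumptions, get a local scale of 0.005 with the default trust coefficient; layers with proportionally larger gradients get proportionally smaller steps.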

Bio: Yang You is an assistant professor (tenure-track) at National University of Singapore. He received his PhD in Computer Science from UC Berkeley. His advisor is Prof. James Demmel, who was the former chair of the Computer Science Division and EECS Department. Yang You's research interests include Parallel/Distributed Algorithms, High Performance Computing, and Machine Learning. The focus of his current research is scaling up deep neural networks training on distributed systems or supercomputers. In 2017, his team broke the world record of ImageNet training speed, which was covered by the technology media like NSF, ScienceDaily, Science NewsLine, and i-programmer. In 2019, his team broke the world record of BERT training speed. The BERT training techniques have been used by many tech giants like Google, Microsoft, and NVIDIA. Yang You’s LARS and LAMB optimizers are available in industry benchmark MLPerf. He is a winner of IPDPS 2015 Best Paper Award (0.8%), ICPP 2018 Best Paper Award (0.3%) and ACM/IEEE George Michael HPC Fellowship. Yang You is a Siebel Scholar and a winner of Lotfi A. Zadeh Prize. For more information, please check his lab’s homepage at https://ai.comp.nus.edu.sg/

...
Speaker

Dr. Chetan Arora

Associate Professor

Indian Institute of Technology, Delhi


Research Interests: Trustworthy AI, Cancer Detection, Mobility, Egocentric Vision, Social Impact

Topic:- Efficient algorithms for exploiting complex prior knowledge in neural network predictions

Speaker

Prof. Vaibhav Rajan

Assistant Professor, Department of Information Systems and Analytics

School of Computing, National University of Singapore (NUS)

Research Interests: Data Science & Business Analytics, Healthcare Informatics, Intelligent Systems


Topic:- Healthcare Analytics: Learning from Multiple Heterogeneous Data Sources

...

Abstract: The increasing availability of digitized clinical and genomic data presents an unprecedented opportunity to study and gain deeper understanding of diseases, develop new treatments and improve healthcare ecosystems. However, the data also poses substantial modelling challenges due to the heterogeneity of measurements involved. In this talk, I’ll describe a new deep learning based technique, developed in my group, for unsupervised learning of representations from arbitrary collections of matrices. I’ll outline how our technique can be used for predicting gene-disease and drug-target associations in cancer.


Bio: Vaibhav Rajan is an Assistant Professor in the Department of Information Systems and Analytics at the School of Computing, National University of Singapore (NUS). Earlier, he was a Senior Research Scientist at Xerox Research where he led a project on Clinical Decision Support Systems for over four years. He has also worked as a consultant at Hewlett-Packard Labs and as Chief Data Scientist at Videoken (an education technology startup). Vaibhav Rajan received his PhD and Master’s degrees in Computer Science from the Swiss Federal Institute of Technology at Lausanne (EPFL), Switzerland in 2012 and 2008 respectively and his Bachelor’s degree in Computer Science from Birla Institute of Technology and Science (BITS), Pilani, India in 2004. His research interests include Machine Learning, Algorithm Design and their applications, primarily in Healthcare and Bioinformatics. He is a recipient of the ERS IASC Young Researchers Award 2014 given by European Regional Section (ERS) of the International Association for Statistical Computing (IASC).

...
Speaker

Mr. Saurabh Singal

Founder and MD

KnowDis Data Science

Professional Interests: Deep learning for biology and medicine.


Topic:- Deep Learning for Molecular Discovery

Abstract: In recent years, Convolutional Neural Networks have revolutionised image recognition. Word embeddings are an integral component of LSTM and Transformer models, and have led to rapid advances in the field of Natural Language Processing. Now, these technologies are making waves in the Life Sciences. Molecular Biology has successfully utilized embeddings to represent discrete biological data as real-valued vectors, and convolutional networks have been successfully used to represent molecular graphs, resulting in significant success in the prediction of molecular properties. This, in turn, has led to big gains in the fields of Virtual Screening, Drug Discovery, and Drug Repurposing. This talk will review recent developments in this new and fast-growing field, along with an illustration of how Deep Learning methods are being used in the war against COVID-19.

Bio: Saurabh Singal holds a B.Tech. degree in Computer Science & Engineering from IIT Delhi and an M.S. from Carnegie Mellon University. He worked in leading investment banks (Deutsche Bank, Merrill Lynch and Credit Suisse) and hedge funds (Harborview, Manchester Partners) in the global financial hubs (Tokyo, New York, London and Singapore) as a derivatives trader. He went on to manage the Indea ANKAM Fund, a quant hedge fund in Singapore, which used highly sophisticated and intelligent algorithms for making trading decisions. After leaving the world of investment management, Saurabh founded Ankam, a Singapore based Machine Learning company which provided its clients with algorithmic solutions in the domains of Pharma and Finance. Saurabh founded KnowDis Data Science in 2018.

Speaker

Mr. Sanjay Kumar

Revenue Secretary, Goa

IAS | Fulbright Scholar | Ex-IAF | Master in City Planning, MIT

Research Interests: Urbanisation and Property rights


Topic:- AI in Governance

...

Abstract: -


Bio: After his schooling in Delhi, Shri Sanjay Kumar joined the Indian Air Force in 1998. During his more than ten years in the IAF, he served at different air bases and completed his graduation from Delhi University and a post-graduation in Political Science.
Since 2008, he has been a member of the Indian Administrative Service. Initially, he served as SDM and District Magistrate in the Andaman and Nicobar Islands. On 26th January 2014, he received the Lt. Governor's Commendation for handling the aftermath of Cyclone Phailin.
From 2014-17, he handled many assignments in Delhi, such as DM (New Delhi district), Excise Commissioner, Delhi, and Commissioner/Special Commissioner Transport, Delhi.
In 2017, he took study leave to pursue his interest in urbanisation and property rights. As a Fulbright Scholar, he completed a Master's degree in City Planning at MIT. During his degree, he did field research in Argentina and Rwanda on informal settlements and land reforms.
Currently, he is Secretary to the Government of Goa, pursuing his passion to modernise property records in Goa.


Brought to you by


About KnowDis Data Science


KnowDis Data Science is a Machine Learning company, led by Saurabh Singal, an alumnus of IIT-Delhi and Carnegie Mellon University. KnowDis Data Science provides machine-learning solutions to clients across e-commerce, healthcare, and finance domains. We work at the cutting edge of Deep Learning technologies for both NLP and Computer Vision.



About the School of AI


Indian Institute of Technology (IIT) Delhi has established an independent "School of Artificial Intelligence (ScAI)" on its campus. The new school will begin its PhD program from the next admission cycle (January 2021). The institute is also planning to offer postgraduate-level degree courses at a later stage.