Faculty Achievement

Published Works
IBA faculty co-authors a paper on self-supervised machine learning by a generative pretext task of isotropic Gaussian noise prediction

Dr. Tahir Syed, Assistant Professor, Department of Computer Science, School of Mathematics and Computer Science, co-authored a research paper titled 'Self-supervision for tabular data by learning to predict additive homoskedastic Gaussian noise as pretext', published in ACM Transactions on Knowledge Discovery from Data.

Summary
In much of machine learning, humans provide the correct responses (labels) for individual data points, and the algorithm tries to imitate those responses on new data. However, data may contain millions of points, and human intervention at that scale is infeasible. Self-supervision uses the data itself to generate labels for a new predictive task; getting better at that task helps unlock hidden potential in the data, so that a given supervised prediction task can be performed well with possibly far fewer labels. The first task, which we make up and generate labels for, is called the pretext task. We invent a pretext that adds noise to the data, moving each point by a small amount in feature space. Under controlled circumstances and appropriate assumptions, predicting how much a point has moved becomes our pretext task. We develop theory that expresses this as a form of hypothesis-space shaping, and an algorithm to apply the pretext under various settings, including varying amounts of supervision on the main task. We observe that our algorithm's predictive performance comes remarkably close to that of full supervision while using just 1% of the original labelled examples.
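To make the pretext concrete, here is a minimal sketch in Python of learning to predict additive homoskedastic Gaussian noise on tabular data. The scikit-learn regressor, data shapes, and noise level are illustrative assumptions; the paper's actual models, theory, and training procedure are not reproduced here.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Unlabelled tabular data: 1,000 rows, 10 numeric features (illustrative).
X = rng.normal(size=(1000, 10))

# Pretext: add homoskedastic Gaussian noise (the same variance for every
# feature) so that each point moves by a small amount in feature space.
sigma = 0.1
noise = rng.normal(scale=sigma, size=X.shape)
X_noisy = X + noise

# Pretext task: regress the displacement from the corrupted point.
# No human-provided labels are needed; the data generate their own targets.
pretext_model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500,
                             random_state=0)
pretext_model.fit(X_noisy, noise)

# The representation learned this way can then be reused for the downstream
# supervised task, which may have labels for only a small fraction of rows.
```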

The article can be accessed here.


Published Works
IBA faculty co-authors research paper on Deep Generative Models to Counter Class Imbalance

Dr. Tahir Syed, Assistant Professor – Department of Computer Science, School of Mathematics and Computer Science, has co-authored a paper titled 'Deep Generative Models to Counter Class Imbalance: A Model-Metric Mapping with Proportion Calibration Methodology'. The research paper has been published in the high-impact, HEC-recognized journal IEEE Access, with a W-ranking.

The models used in the paper are the GAN, VAE, and RBM, and the metrics include Precision, Recall, F1-Score, AUC, G-Mean, and Balanced Accuracy. The authors compare these models with the established class of data-synthesizing analogues.

Abstract

Re-sampling-based methods are the most often used techniques for addressing class imbalance in machine learning. The introduction of deep generative models for increasing the size of under-represented classes raises the question of whether the model used for data augmentation is compatible with the metric selected for classification quality.
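As an illustration of the evaluation side, the sketch below computes the metrics named above (Precision, Recall, F1-Score, AUC, G-Mean, and Balanced Accuracy) on a synthetic imbalanced binary problem with scikit-learn. The dataset and classifier are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (precision_score, recall_score, f1_score,
                             roc_auc_score, balanced_accuracy_score)
from sklearn.model_selection import train_test_split

# Illustrative imbalanced binary problem (roughly 5% minority class).
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
y_pred = clf.predict(X_te)
y_prob = clf.predict_proba(X_te)[:, 1]

recall = recall_score(y_te, y_pred)                    # sensitivity
specificity = recall_score(y_te, y_pred, pos_label=0)  # recall of the majority class

metrics = {
    "Precision": precision_score(y_te, y_pred),
    "Recall": recall,
    "F1-Score": f1_score(y_te, y_pred),
    "AUC": roc_auc_score(y_te, y_prob),
    "G-Mean": np.sqrt(recall * specificity),           # geometric mean of class-wise recalls
    "Balanced Accuracy": balanced_accuracy_score(y_te, y_pred),
}
print(metrics)
```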

For more information:
https://scholar.google.com.pk/citations?view_op=view_citation&hl=en&user=Y_kxezwAAAAJ&sortby=pubdate&citation_for_view=Y_kxezwAAAAJ:L8Ckcad2t8MC


Published Works
IBA faculty co-authors research paper on pooling method based on Zeckendorf’s number series

Dr. Tahir Syed, Assistant Professor – Department of Computer Science, has co-authored a paper titled “A New Pooling Approach Based on Zeckendorf’s Theorem for Texture Transfer Information”. The research paper has been published in the high-impact, HEC-recognized journal Entropy, with a W-ranking.

Dr. Tahir has worked extensively on fundamental Machine Learning problems such as class imbalance and distribution drift, and specializes in designing new losses for optimizing neural networks for specific needs. He was nominated to design Pakistan’s first graduate Data Science curriculum, and he joined the IBA Karachi family in 2020, when the institute launched the programme.

Abstract

The pooling layer is at the heart of every convolutional neural network (CNN), contributing to invariance to data variation. This paper proposes a pooling method based on Zeckendorf’s number series. The maximum pooling layers are replaced with the Z-pooling layer, which captures texels from input images, convolution layers, etc. It is shown that the properties of Z-pooling are better adapted to segmentation tasks than those of other pooling functions. The method was evaluated on a traditional image segmentation task and on a dense labeling task carried out with a series of deep learning architectures in which the usual maximum pooling layers were altered to use the proposed pooling mechanism.
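Zeckendorf's theorem states that every positive integer has a unique representation as a sum of non-consecutive Fibonacci numbers. The Python sketch below illustrates only that decomposition; how the paper uses it to select activations inside a pooling window is not reproduced here.

```python
def zeckendorf(n: int) -> list[int]:
    """Return the unique non-consecutive Fibonacci numbers that sum to n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # Build Fibonacci numbers up to n (1, 2, 3, 5, 8, ...).
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    # Greedily taking the largest Fibonacci number not exceeding the
    # remainder yields the Zeckendorf representation.
    parts = []
    remainder = n
    for f in reversed(fibs):
        if f <= remainder:
            parts.append(f)
            remainder -= f
    return parts

# Example: 100 = 89 + 8 + 3
print(zeckendorf(100))   # [89, 8, 3]
```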

For more information: https://www.mdpi.com/1099-4300/23/3/279.


Published Works
IBA faculty co-authors research paper on integrated prediction system for a retail petrol station

Dr. Tahir Syed, Assistant Professor, Department of Computer Science, has co-authored a paper titled 'Real-time forecasting of petrol retail using dilated causal CNNs'. The research paper has been published in the high-impact, HEC-recognized Journal of Ambient Intelligence and Humanized Computing, with a W-ranking.

Dr. Tahir has worked extensively on fundamental Machine Learning problems such as class imbalance and distribution drift, and specializes in designing new losses for optimizing neural networks for specific needs. He was nominated to design Pakistan's first graduate Data Science curriculum, and he joined the IBA Karachi family in 2020, when the institute launched the program.

Abstract

The recent popularity of smart cities and smart homes has made the adoption of Internet of Things (IoT) devices ubiquitous. Most of these IoT devices are low-end devices with limited capabilities, and their low processing power is a limitation when training neural-network-based predictive models on them. In addition, it is still common practice to deploy these models on cloud servers that possess dedicated high-performance computing hardware. However, for IoT applications, it is not feasible to send voluminous raw data to the cloud or a remote backend server, on account of high latency, information-security concerns, or lack of network coverage. In this work, we develop an integrated prediction system for a retail petrol station within the operational constraints of the IoT ecosystem. Our main contribution is the combination of the recent concepts of dilated convolution and so-called causal convolution into a 1D dilated causal convolutional neural network for time-series prediction.
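For readers unfamiliar with the building block, here is a minimal PyTorch sketch of a 1D dilated causal convolutional network for one-step-ahead time-series forecasting. The framework, layer widths, and dilation schedule are illustrative assumptions rather than the architecture reported in the paper.

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """1D convolution that only looks at past time steps (left-only padding)."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)

    def forward(self, x):                        # x: (batch, channels, time)
        x = nn.functional.pad(x, (self.pad, 0))  # pad on the left only
        return self.conv(x)

class DilatedCausalCNN(nn.Module):
    """Stack of dilated causal convolutions with dilations 1, 2, 4, 8."""
    def __init__(self, in_ch=1, hidden=32, kernel_size=2, levels=4):
        super().__init__()
        layers, ch = [], in_ch
        for i in range(levels):
            layers += [CausalConv1d(ch, hidden, kernel_size, dilation=2 ** i),
                       nn.ReLU()]
            ch = hidden
        self.network = nn.Sequential(*layers)
        self.head = nn.Conv1d(hidden, 1, kernel_size=1)  # one-step-ahead forecast

    def forward(self, x):
        return self.head(self.network(x))

# Example: forecast the next value of a univariate series of length 64.
series = torch.randn(8, 1, 64)                 # (batch, features, time)
model = DilatedCausalCNN()
prediction = model(series)[:, :, -1]           # output at the last time step
print(prediction.shape)                        # torch.Size([8, 1])
```

Left-only padding keeps each output dependent only on past time steps (causality), while doubling the dilation at each layer grows the receptive field exponentially with depth.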

For more information, click here.


Published Works
IBA faculty co-authors research paper on emerging Big Data challenges

Dr. Tahir Syed, Assistant Professor – Department of Computer Science, has co-authored a paper titled, "Potential Deep Learning Solutions to Persistent and Emerging Big Data Challenges—A Practitioners' Cookbook". The research paper has been published in the high-impact HEC-recognized ACM Computing Surveys journal, with a W-ranking.

Dr. Tahir has worked extensively on fundamental Machine Learning problems such as class imbalance and distribution drift, and specializes in designing new losses for optimizing neural networks for specific needs. He was nominated to design Pakistan's first graduate Data Science curriculum, and he joined the IBA Karachi family in 2020, when the institute launched the program.

Abstract

The phenomenon of Big Data continues to present moving targets for the scientific and technological state of the art. This work demonstrates that the solution space of these challenges has expanded, with deep learning now moving beyond traditional applications in computer vision and natural language processing to diverse, core machine learning tasks such as learning with streaming and non-i.i.d. data, partial supervision, and large volumes of distributed data while preserving privacy. We present a framework coalescing multiple deep methods, and the corresponding models, as responses to specific Big Data challenges. First, we perform a detailed per-challenge review of existing techniques, with benchmarks and usage advice, and subsequently synthesize them into one organic construct that, we discover, principally uses extensions of one underlying model, the auto-encoder. This work therefore provides a synthesis in which challenges at scale across the Vs of Big Data can be addressed by new algorithms and architectures being proposed in the deep learning community. The value it offers readers from either community, in terms of the nomenclature, concepts, and techniques of the other, would advance the cause of multi-disciplinary, transversal research and accelerate the advance of technology in both domains.
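Since the survey identifies the auto-encoder as the underlying model that most of the collected methods extend, a minimal PyTorch auto-encoder is sketched below as that basic building block. The sizes and training loop are illustrative assumptions and do not reflect any specific extension discussed in the survey.

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Minimal fully connected auto-encoder: compress, then reconstruct."""
    def __init__(self, in_dim=784, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                     nn.Linear(128, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, in_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Unsupervised training on a batch of unlabeled vectors (illustrative data).
model = AutoEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 784)
for _ in range(10):
    reconstruction = model(x)
    loss = nn.functional.mse_loss(reconstruction, x)  # reconstruction error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print(float(loss))
```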

For more information: https://dl.acm.org/doi/abs/10.1145/3427476