Designing and developing efficient solutions to complex problems

Publications & Projects

Identifying Autism Spectrum Disorder in fMRI Brain Scans

 

Thesis

2023 14th International Conference on Information, Intelligence, Systems & Applications (IISA)

Autism Spectrum Disorder (ASD) affects a large portion of the global population both directly and indirectly. The biological etiology of the disorder is not sufficiently understood, and current diagnoses rely on behavioural indicators which do not provide a reliable basis for diagnosis until about 2 years of age. Identifying a biological marker of ASD would aid in understanding the disorder and potentially allow for earlier, more objective diagnoses and treatments to improve the quality of life of individuals with ASD. The analysis of functional connectivity in the brain using functional Magnetic Resonance Imaging (fMRI) has been identified as a promising method for discovering such biological markers. This study recreated a prominent state-of-the-art work in explainable classification of brain networks, but found results inconsistent with what was claimed. The methods were modified in various ways to improve accuracy and performance. A new, simpler method named Discriminative Edges (DE) was developed which achieved similar accuracies with improved performance and explainability. DE was also adapted to receive raw correlation matrices as well as thresholded correlation matrices representing brain networks, and it was found that raw correlation matrices provided more useful information for classification. An implementation package was provided to aid future researchers in validating and improving upon these results. Suggestions for future work based on the findings of this study were provided, the most important being to procure more datasets, discover data-driven subcategories of ASD, and maintain reproducibility in studies.

 

Two approaches to survival analysis of open source Python projects

 

ICPC ’22: 30th International Conference on Program Comprehension

A recent study applied frequentist survival analysis methods to a subset of the Software Heritage Graph and determined which attributes of an open source software project contribute to its health. This paper serves as an exact replication of that study. In addition, Bayesian survival analysis methods were applied to the same dataset, and an additional project attribute was studied to serve as a conceptual replication. Both analyses focus on the effects of certain attributes on the survival of open-source software projects as measured by their revision activity. Methods such as the Kaplan-Meier estimator, Cox Proportional-Hazards model, and the visualization of posterior survival functions were used for each of the project attributes. The results show that projects which publish major releases, have repositories on multiple hosting services, possess a large team of developers, and make frequent revisions have a higher likelihood of survival in the long run. The findings were similar to the original study; however, a deeper look revealed quantitative inconsistencies.
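For readers unfamiliar with the Kaplan-Meier estimator mentioned above, here is a minimal sketch in plain Python. It is an illustration only, not the code used in the paper: the function name `kaplan_meier`, the input layout, and the toy data are assumptions. A project "event" here stands for the end of its revision activity, and a censored project is one that was still active when observation stopped.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: observed lifetime of each project (hypothetical units)
    observed:  True if the project's activity ended, False if censored
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # events and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        deaths = events.get(t, 0)
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Toy data: five projects, two of them censored
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
```

Censored projects never lower the curve directly; they only shrink the risk set used at later event times.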

 

Triangle Enumeration for Billion-Scale Graphs in RDBMS

 

International Conference on Advanced Information Networking and Applications

Triangle enumeration is considered a fundamental graph analytics problem with many applications, including detecting fake accounts, spam detection, and community searches. Real-world graph data sets are growing to unprecedented levels, and many of the existing algorithms fail to process them or take a very long time to produce results. Many organizations invest in new hardware and new services in order to keep up with the data growth and often neglect the well-established and widely used relational database management systems (RDBMSs). In this paper we present a carefully engineered RDBMS solution to the problem of triangle enumeration for very large graphs. We show that RDBMSs are suitable tools for enumerating billions of triangles in billion-scale networks on a consumer-grade server. We also compare our RDBMS solution's performance to that of a native graph database and show that our RDBMS solution outperforms it by an order of magnitude.
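To give a sense of how triangle enumeration can be expressed relationally, here is a small sketch that uses Python's built-in `sqlite3` module and a three-way self-join on an edge table. It is not the engineered solution from the paper: the table name `edge`, the column names, and the function `enumerate_triangles` are assumptions made for illustration, and the approach is only suitable for toy graphs.

```python
import sqlite3


def enumerate_triangles(edges):
    """List each triangle once as (u, v, w) with u < v < w."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edge (src INTEGER, dst INTEGER, PRIMARY KEY (src, dst))"
    )
    # Store each undirected edge once, oriented from smaller to larger id,
    # and drop self-loops; this ordering prevents duplicate triangles.
    rows = {(min(u, v), max(u, v)) for u, v in edges if u != v}
    conn.executemany("INSERT INTO edge VALUES (?, ?)", rows)
    query = """
        SELECT a.src, a.dst, b.dst
        FROM edge AS a
        JOIN edge AS b ON b.src = a.dst
        JOIN edge AS c ON c.src = a.src AND c.dst = b.dst
        ORDER BY a.src, a.dst, b.dst
    """
    triangles = conn.execute(query).fetchall()
    conn.close()
    return triangles


triangles = enumerate_triangles([(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)])
```

Edges `a` and `b` form a path u, v, w, and the join on `c` checks that the closing edge (u, w) exists.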

 

To see more of my projects, visit my GitHub and ObservableHQ profiles.