Architecture – AI4Biodiversity

AI and human-computational hybrid system

Our research group aims to address the challenge of biodiversity monitoring by developing novel hybrid human-computational systems. We are a multidisciplinary team of experts in ecology, artificial intelligence, online communities, human computation, and statistics. Our goal is to make sense of unsystematic biodiversity image data and unlock its scientific potential.

Data Collection: Integrated Data Collection Systems

Biodiversity loss is a pressing issue that necessitates immediate action. Traditional methods for estimating wildlife and plant population sizes are costly and depend on trained personnel. Currently, the populations of only a handful of species are systematically monitored, and only in restricted regions. To address these challenges, we propose integrating and upscaling techniques such as the processing of drone and satellite imagery with citizen science projects and social media data, including observations from global internet databases such as iNaturalist and eBird. By combining these sources of data, we can create a more comprehensive and scalable approach to biodiversity monitoring.

Traditional collection methods

Upscale collection methods

Data Betterment: AI, Algorithms, and Human-Computational

Recent advances in deep learning detection algorithms have made automated classification more efficient and accurate, but identifying species and individuals in photographs remains challenging. To address this challenge, we incorporate human expertise into the process in two primary ways. First, we pay close attention to the confidence of our algorithms’ classifications. When an algorithm’s classification confidence is low, we route the image to human experts for verification. Second, we incorporate prior information about each species’ spatio-temporal distribution and ethology into the process.

By combining this prior knowledge with the automated classification results, our system produces updated posterior classification probabilities. When the posterior probabilities differ significantly from the prior probabilities, human experts are consulted. This approach minimizes misclassification and alerts experts to unexpected results, potentially leading to new discoveries. Our system’s contributions extend beyond species identification and can be applied to other algorithmic-human image classification systems.
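The two review triggers described above can be illustrated with a minimal sketch. It is not the group's implementation: it assumes the classifier scores can be treated as proportional to likelihoods, combines them with the prior using Bayes' rule, and measures how far the posterior moves from the prior with total variation distance. The function names and both thresholds (`min_confidence`, `max_shift`) are hypothetical choices made for illustration.

```python
def posterior_from_prior(classifier_probs, prior):
    """Combine classifier scores with a spatio-temporal prior via Bayes' rule.

    Assumption: the classifier scores are proportional to the likelihood of
    the image under each candidate species.
    """
    unnormalized = [c * p for c, p in zip(classifier_probs, prior)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("prior and classifier scores have no overlapping support")
    return [u / total for u in unnormalized]


def needs_expert(classifier_probs, prior, min_confidence=0.8, max_shift=0.5):
    """Return (flag, posterior); flag is True when an expert should review.

    Trigger 1: the classifier's top score is below min_confidence.
    Trigger 2: the posterior differs strongly from the prior, measured here
               by total variation distance (an illustrative choice).
    """
    posterior = posterior_from_prior(classifier_probs, prior)
    low_confidence = max(classifier_probs) < min_confidence
    shift = 0.5 * sum(abs(q - p) for q, p in zip(posterior, prior))
    return (low_confidence or shift > max_shift), posterior


if __name__ == "__main__":
    # Confident classifier that agrees with the prior: no review needed.
    print(needs_expert([0.9, 0.1], [0.8, 0.2]))
    # Confident classifier that contradicts the prior: flagged as unexpected.
    print(needs_expert([0.05, 0.95], [0.9, 0.1]))
```

In this sketch, a confident classification of a species that the prior considers unlikely at that place and time is sent to an expert even though the classifier itself is not uncertain, which is how unexpected observations would surface.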

Data Aggregation: Statistics, Algorithms, and Gaming

In recent years, global internet databases such as iNaturalist and eBird have allowed non-experts to share their fauna, flora and funga observations. However, the data accumulated in such citizen science projects is not as consistent and reliable as data collected through traditional scientific protocols. One important and common type of bias in citizen-science wildlife data is called “taxonomic bias”: the personal preferences and interests of people toward certain species cause these species to be overrepresented or underrepresented in the data.

To mitigate this taxonomic bias, we have devised a statistical learning methodology that models the individual preference of each observer toward a given target species. This allows us to debias inference about the unknown true encounter rate with the target species at a given location and time. We consider two inference approaches, both based on the same underlying statistical model: one estimates the ratios of the unknown encounter rates across locations and times, and the other estimates the absolute encounter rates of the target species over the entire domain considered.
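To make the idea of observer-level debiasing concrete, here is a small sketch using a stand-in model, not the group's actual methodology. It assumes that the expected number of target-species reports by observer i at site s is effort × preference_i × rate_s (a multiplicative Poisson model), and fits it by alternating maximum-likelihood updates. Under this assumption only the ratios of site rates are identifiable, so the first site's rate is fixed to 1. The function name, the data layout, and the numbers in the example are all hypothetical.

```python
def fit_relative_rates(counts, effort, n_iter=200):
    """Fit counts[i][s] ~ Poisson(effort[i][s] * pref[i] * rate[s]).

    counts[i][s]: target-species reports by observer i at site s.
    effort[i][s]: total observations by observer i at site s (exposure).
    Returns (pref, rate) with rate[0] normalized to 1, so rate[s] is the
    encounter rate at site s relative to site 0.
    """
    n_obs, n_sites = len(counts), len(counts[0])
    pref = [1.0] * n_obs
    rate = [1.0] * n_sites
    for _ in range(n_iter):
        # Conditional MLE of each site rate given observer preferences.
        for s in range(n_sites):
            denom = sum(effort[i][s] * pref[i] for i in range(n_obs))
            total = sum(counts[i][s] for i in range(n_obs))
            rate[s] = total / denom if denom > 0 else 0.0
        # Conditional MLE of each observer preference given site rates.
        for i in range(n_obs):
            denom = sum(effort[i][s] * rate[s] for s in range(n_sites))
            pref[i] = sum(counts[i]) / denom if denom > 0 else 0.0
        # Remove the scale ambiguity by fixing the first site's rate to 1.
        scale = rate[0]
        if scale > 0:
            rate = [r / scale for r in rate]
            pref = [p * scale for p in pref]
    return pref, rate


if __name__ == "__main__":
    # Hypothetical data: observer 1 favors the target species and is mostly
    # active at site 1, which inflates the naive pooled estimate there.
    effort = [[100, 10], [10, 100]]
    counts = [[10, 2], [4, 80]]
    pref, rate = fit_relative_rates(counts, effort)
    naive = [sum(c[s] for c in counts) / sum(e[s] for e in effort) for s in range(2)]
    print("naive site-1/site-0 ratio:", naive[1] / naive[0])
    print("debiased site-1/site-0 ratio:", rate[1] / rate[0])
```

The example data were generated exactly from preferences 0.1 and 0.4 and a true rate ratio of 2. Pooling all reports without accounting for who made them gives a ratio well above 2, because the observer who favors the target species contributes most of the reports at the second site.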

In addition to our statistical learning methodology for mitigating taxonomic bias in citizen-science biodiversity data, we are also exploring biases in biodiversity data using controlled, standardized environments in VR games. By incorporating observer experience, we can better understand human preferences and personal biases related to animal behavior, traits, and familiarity.