Prabin Bhandari

Fairfax, Virginia

Hi! I’m Prabin Bhandari, a Ph.D. Candidate in Computer Science at George Mason University. My research focuses on utilizing Large Language Models (LLMs) for geospatial data, information extraction, and agentic systems. I’m fortunate to be advised by Dr. Antonios Anastasopoulos and Dr. Dieter Pfoser and to be affiliated with the George Mason NLP Lab.

selected publications

2024

From Text to Maps: LLM-Driven Extraction and Geotagging of Epidemiological Data

Karlyn K. Harrod, Prabin Bhandari, and Antonios Anastasopoulos

In Proceedings of the Third Workshop on NLP for Positive Impact, Nov 2024

Outstanding Paper Award

Epidemiological datasets are essential for public health analysis and decision-making, yet they remain scarce and often difficult to compile due to inconsistent data formats, language barriers, and evolving political boundaries. Traditional methods of creating such datasets involve extensive manual effort and are prone to errors in accurate location extraction. To address these challenges, we propose utilizing large language models (LLMs) to automate the extraction and geotagging of epidemiological data from textual documents. Our approach significantly reduces the manual effort required, limiting human intervention to validating a subset of records against text snippets and verifying the geotagging reasoning, as opposed to reviewing multiple entire documents manually to extract, clean, and geotag. Additionally, the LLMs identify information often overlooked by human annotators, further enhancing the dataset’s completeness. Our findings demonstrate that LLMs can be effectively used to semi-automate the extraction and geotagging of epidemiological data, offering several key advantages: (1) comprehensive information extraction with minimal risk of missing critical details; (2) minimal human intervention; (3) higher-resolution data with more precise geotagging; and (4) significantly reduced resource demands compared to traditional methods.
@inproceedings{harrod-etal-2024-text,
  title = {From Text to Maps: {LLM}-Driven Extraction and Geotagging of Epidemiological Data},
  author = {Harrod, Karlyn K. and Bhandari, Prabin and Anastasopoulos, Antonios},
  editor = {Dementieva, Daryna and Ignat, Oana and Jin, Zhijing and Mihalcea, Rada and Piatti, Giorgio and Tetreault, Joel and Wilson, Steven and Zhao, Jieyu},
  booktitle = {Proceedings of the Third Workshop on NLP for Positive Impact},
  month = nov,
  year = {2024},
  address = {Miami, Florida, USA},
  publisher = {Association for Computational Linguistics},
  pages = {258--270},
  outstanding_paper = {true}
}

Urban Mobility Assessment Using LLMs

Prabin Bhandari, Antonios Anastasopoulos, and Dieter Pfoser

In Proceedings of the 32nd ACM International Conference on Advances in Geographic Information Systems, Nov 2024

Best Paper Award

In urban science, understanding mobility patterns and analyzing how people move around cities helps improve the overall quality of life and supports the development of more livable, efficient, and sustainable urban areas. A challenging aspect of this work is the collection of mobility data through user tracking or travel surveys, given the associated privacy concerns, noncompliance, and high cost. This work proposes an innovative AI-based approach for synthesizing travel surveys by prompting large language models (LLMs), aiming to leverage their vast amount of relevant background knowledge and text generation capabilities. Our study evaluates the effectiveness of this approach across various U.S. metropolitan areas by comparing the results against existing survey data at different granularity levels. These levels include (i) pattern level, which compares aggregated metrics such as the average number of locations traveled and travel time, (ii) trip level, which focuses on comparing trips as whole units using transition probabilities, and (iii) activity chain level, which examines the sequence of locations visited by individuals. Our work covers several proprietary and open-source LLMs, revealing that open-source base models like Llama-2, when fine-tuned on even a limited amount of actual data, can generate synthetic data that closely mimics the actual travel survey data and, as such, provides an argument for using such data in mobility studies.
@inproceedings{10.1145/3678717.3691221,
  author = {Bhandari, Prabin and Anastasopoulos, Antonios and Pfoser, Dieter},
  title = {Urban Mobility Assessment Using LLMs},
  year = {2024},
  isbn = {9798400711077},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3678717.3691221},
  doi = {10.1145/3678717.3691221},
  booktitle = {Proceedings of the 32nd ACM International Conference on Advances in Geographic Information Systems},
  pages = {67--79},
  numpages = {13},
  keywords = {Large Language Models, Travel Data, Travel Survey, Travel Survey Data Simulation},
  location = {Atlanta, GA, USA},
  series = {SIGSPATIAL '24},
  best_paper = {true}
}

2023

Are Large Language Models Geospatially Knowledgeable?

Prabin Bhandari, Antonios Anastasopoulos, and Dieter Pfoser

In Proceedings of the 31st ACM International Conference on Advances in Geographic Information Systems, Nov 2023

Despite the impressive performance of Large Language Models (LLMs) for various natural language processing tasks, little is known about their comprehension of geographic data and related ability to facilitate informed geospatial decision-making. This paper investigates the extent of geospatial knowledge, awareness, and reasoning abilities encoded within such pretrained LLMs. With a focus on autoregressive language models, we devise experimental approaches related to (i) probing LLMs for geo-coordinates to assess geospatial knowledge, (ii) using geospatial and non-geospatial prepositions to gauge their geospatial awareness, and (iii) utilizing a multidimensional scaling (MDS) experiment to assess the models’ geospatial reasoning capabilities and to determine locations of cities based on prompting. Our results confirm that it takes not only larger but also more sophisticated LLMs to synthesize geospatial knowledge from textual information. As such, this research contributes to understanding the potential and limitations of LLMs in dealing with geospatial information.
@inproceedings{10.1145/3589132.3625625,
  author = {Bhandari, Prabin and Anastasopoulos, Antonios and Pfoser, Dieter},
  title = {Are Large Language Models Geospatially Knowledgeable?},
  year = {2023},
  isbn = {9798400701689},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3589132.3625625},
  doi = {10.1145/3589132.3625625},
  booktitle = {Proceedings of the 31st ACM International Conference on Advances in Geographic Information Systems},
  articleno = {75},
  numpages = {4},
  keywords = {large language models, geospatial knowledge, geospatial awareness, geospatial reasoning},
  location = {Hamburg, Germany},
  series = {SIGSPATIAL '23},
}

Trustworthiness of Children Stories Generated by Large Language Models

Prabin Bhandari and Hannah Brennan

In Proceedings of the 16th International Natural Language Generation Conference, Sep 2023

Large Language Models (LLMs) have shown a tremendous capacity for generating literary text. However, their effectiveness in generating children’s stories has yet to be thoroughly examined. In this study, we evaluate the trustworthiness of children’s stories generated by LLMs using various measures, and we compare and contrast our results with both old and new children’s stories to better assess their significance. Our findings suggest that LLMs still struggle to generate children’s stories at the level of quality and nuance found in actual stories.
@inproceedings{bhandari-brennan-2023-trustworthiness,
  title = {Trustworthiness of Children Stories Generated by Large Language Models},
  author = {Bhandari, Prabin and Brennan, Hannah},
  booktitle = {Proceedings of the 16th International Natural Language Generation Conference},
  month = sep,
  year = {2023},
  address = {Prague, Czechia},
  publisher = {Association for Computational Linguistics},
  pages = {352--361},
}