Gaurab Chhetri
Building things that make a difference.
Austin, TX

Hello World 👋! I'm a software engineer and student researcher at Texas State University. Since starting my journey in 2020, I've built impactful projects across web development, AI, machine learning, and data science. Skilled in TypeScript, JavaScript, Python, React, Next.js, and more, I focus on shipping products that matter. I believe in the mantra: “Do what you want, not what you can!”

featuredProjects
Personal
Bhanai - A Custom Programming Language with a Nepali Touch

Bhanai is a simple and intuitive programming language with a Nepali touch. It leverages Node.js under the hood for execution, allowing users to create `.bhn` files and run them seamlessly.

JavaScript
Research
CognitiveSky - Scalable Sentiment and Narrative Analysis for Decentralized Social Media

CognitiveSky is an open-source research infrastructure for analyzing mental health narratives on Bluesky, combining real-time ingestion, NLP pipelines, and an interactive Next.js dashboard. Accepted for presentation at HICSS 2026. A preprint is available on arXiv; the final version will appear in the official proceedings.

Python, TypeScript
Personal
ComputeNepal - A Tech Blog and Open-Source Learning Platform

ComputeNepal is an independent blog and project hub featuring 200+ articles and open-source learning tools covering programming, AI, data science, and web development. Built to make technical education accessible for learners in Nepal and beyond.

JavaScript, Python, HTML, PHP
Personal
VidXiv - ArXiv Paper to Video Generator

VidXiv automatically converts research papers from ArXiv into engaging narrated videos with scene-by-scene breakdowns, AI-generated scripts, and export-ready MP4s for YouTube or Shorts.

Python
experience

October 2024 - Present

Undergraduate Research Assistant

AIT Lab - TXST, San Marcos, TX

Created 25+ analytical and visualization tools (internal and open-source), processed 150k+ mobility and crash records, and accelerated AI-in-transportation workflows with reproducible pipelines, faculty dashboards, and standardized outputs. Authored or co-authored 5+ manuscripts that are published or available online, with 10+ additional submissions in the pipeline; contributed methods, analysis, modeling, quality control, and documentation across Python, R, and JavaScript. Designed and implemented the AIT Lab website in Next.js and Tailwind CSS and continue to maintain it; the site scores 90+ on Lighthouse SEO and performance and is deployed on Vercel with publications, projects, team, and resources sections.

TypeScript, JavaScript, React.js, Next.js, Tailwind CSS, Git, Python, R, Data Analysis, LaTeX
education

Expected Graduation: May 2028

Bachelor of Science in Computer Science

Texas State University, San Marcos

I am currently pursuing a Bachelor of Science in Computer Science at Texas State University, where I am gaining hands-on experience across computer science, including software development, data structures, algorithms, and web technologies.

Computer Science, Software Development, Data Structures, Algorithms, Web Technologies, Full Stack Development, AI & Machine Learning, Data Science, Research
researchPublications

September 14, 2025 | arXiv preprint arXiv:2509.11444

CognitiveSky: Scalable Sentiment and Narrative Analysis for Decentralized Social Media

Gaurab Chhetri, Anandi Dutta, Subasish Das

The emergence of decentralized social media platforms presents new opportunities and challenges for real-time analysis of public discourse. This study introduces CognitiveSky, an open-source and scalable framework designed for sentiment, emotion, and narrative analysis on Bluesky, a federated Twitter or X.com alternative. By ingesting data through Bluesky's Application Programming Interface (API), CognitiveSky applies transformer-based models to annotate large-scale user-generated content and produces structured and analyzable outputs. These summaries drive a dynamic dashboard that visualizes evolving patterns in emotion, activity, and conversation topics. Built entirely on free-tier infrastructure, CognitiveSky achieves both low operational cost and high accessibility. While demonstrated here for monitoring mental health discourse, its modular design enables applications across domains such as disinformation detection, crisis response, and civic sentiment analysis. By bridging large language models with decentralized networks, CognitiveSky offers a transparent, extensible tool for computational social science in an era of shifting digital ecosystems.

September 14, 2025 | arXiv preprint arXiv:2509.11443

A Transformer-Based Cross-Platform Analysis of Public Discourse on the 15-Minute City Paradigm

Gaurab Chhetri, Darrell Anderson, Boniphace Kutela, Subasish Das

This study presents the first multi-platform sentiment analysis of public opinion on the 15-minute city concept across Twitter, Reddit, and news media. Using compressed transformer models and Llama-3-8B for annotation, we classify sentiment across heterogeneous text domains. Our pipeline handles long-form and short-form text, supports consistent annotation, and enables reproducible evaluation. We benchmark five models (DistilRoBERTa, DistilBERT, MiniLM, ELECTRA, TinyBERT) using stratified 5-fold cross-validation, reporting F1-score, AUC, and training time. DistilRoBERTa achieved the highest F1 (0.8292), TinyBERT the best efficiency, and MiniLM the best cross-platform consistency. Results show News data yields inflated performance due to class imbalance, Reddit suffers from summarization loss, and Twitter offers moderate challenge. Compressed models perform competitively, challenging assumptions that larger models are necessary. We identify platform-specific trade-offs and propose directions for scalable, real-world sentiment classification in urban planning discourse.
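For readers unfamiliar with the evaluation setup named in the abstract, here is a minimal, standard-library sketch of stratified k-fold splitting and a binary F1 score. It is not the authors' pipeline: the function names `stratified_kfold` and `f1_score`, the toy labels, and the round-robin dealing strategy are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose class mix mirrors the full set.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets a fair share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy, imbalanced label set: 20 negatives and 10 positives.
    toy_labels = [0] * 20 + [1] * 10
    for train_idx, test_idx in stratified_kfold(toy_labels, k=5, seed=42):
        positives = sum(toy_labels[i] for i in test_idx)
        print(len(train_idx), len(test_idx), positives)
```

In a real benchmark, each model would be trained on `train_idx` and scored on `test_idx`, with the per-fold F1 values averaged; stratification keeps the minority class present in every fold.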

September 14, 2025 | arXiv preprint arXiv:2509.11449

Tabular Data with Class Imbalance: Predicting Electric Vehicle Crash Severity with Pretrained Transformers (TabPFN) and Mamba-Based Models

Shriyank Somvanshi, Pavan Hebli, Gaurab Chhetri, Subasish Das

This study presents a deep tabular learning framework for predicting crash severity in electric vehicle (EV) collisions using real-world crash data from Texas (2017-2023). After filtering for electric-only vehicles, 23,301 EV-involved crash records were analyzed. Feature importance techniques using XGBoost and Random Forest identified intersection relation, first harmful event, person age, crash speed limit, and day of week as the top predictors, along with advanced safety features like automatic emergency braking. To address class imbalance, Synthetic Minority Over-sampling Technique and Edited Nearest Neighbors (SMOTEENN) resampling was applied. Three state-of-the-art deep tabular models, TabPFN, MambaNet, and MambaAttention, were benchmarked for severity prediction. While TabPFN demonstrated strong generalization, MambaAttention achieved superior performance in classifying severe injury cases due to its attention-based feature reweighting. The findings highlight the potential of deep tabular architectures for improving crash severity prediction and enabling data-driven safety interventions in EV crash contexts.
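As a rough illustration of the oversampling half of SMOTEENN, the sketch below generates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbors. It covers the SMOTE step only; the Edited Nearest Neighbors cleaning step is omitted. The function name `smote_oversample`, its parameters, and the toy points are assumptions for illustration and are not drawn from the paper or from any library.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points from minority-class feature tuples.

    Hypothetical sketch of SMOTE-style interpolation; requires at least
    two minority points. The ENN cleaning step is not implemented here.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


if __name__ == "__main__":
    # Toy two-feature minority class (values are made up).
    severe = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (1.2, 3.0)]
    for point in smote_oversample(severe, n_new=3, k=2, seed=1):
        print(point)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies rather than duplicating existing rows.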

August 26, 2025 | arXiv preprint arXiv:2508.19239

Model Context Protocols in Adaptive Transport Systems: A Survey

Gaurab Chhetri, Shriyank Somvanshi, Md Monzurul Islam, Shamyo Brotee, Mahmuda Sultana Mimi, Dipti Koirala, Biplov Pandey, Subasish Das

The rapid expansion of interconnected devices, autonomous systems, and AI applications has created severe fragmentation in adaptive transport systems, where diverse protocols and context sources remain isolated. This survey provides the first systematic investigation of the Model Context Protocol (MCP) as a unifying paradigm, highlighting its ability to bridge protocol-level adaptation with context-aware decision making. Analyzing established literature, we show that existing efforts have implicitly converged toward MCP-like architectures, signaling a natural evolution from fragmented solutions to standardized integration frameworks. We propose a five-category taxonomy covering adaptive mechanisms, context-aware frameworks, unification models, integration strategies, and MCP-enabled architectures. Our findings reveal three key insights: traditional transport protocols have reached the limits of isolated adaptation, MCP's client-server and JSON-RPC structure enables semantic interoperability, and AI-driven transport demands integration paradigms uniquely suited to MCP. Finally, we present a research roadmap positioning MCP as a foundation for next-generation adaptive, context-aware, and intelligent transport infrastructures.
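To make the "client-server and JSON-RPC structure" mentioned in the abstract concrete, the following sketch builds and parses JSON-RPC 2.0 messages of the kind an MCP client exchanges with a server when invoking a tool. The `tools/call` method and its `name`/`arguments` parameters follow the MCP specification as commonly documented; the tool name `get_signal_timing`, its argument, and the helper names are hypothetical and do not come from the survey.

```python
import json
from itertools import count

# Monotonic request ids so responses can be matched to requests.
_request_ids = count(1)


def make_tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def parse_response(raw):
    """Return the `result` of a JSON-RPC 2.0 response, or raise on error."""
    message = json.loads(raw)
    if message.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if "error" in message:
        raise RuntimeError(message["error"].get("message", "unknown error"))
    return message["result"]


if __name__ == "__main__":
    # Hypothetical transport-domain tool and argument, for illustration only.
    request = make_tool_call("get_signal_timing", {"intersection_id": "A-17"})
    print(json.dumps(request, indent=2))
```

The point of the shared envelope is that any client can talk to any server exposing tools this way, which is the interoperability property the survey attributes to MCP.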