Projects

Things I’ve built.

Systems, networks, architecture, and a few side projects, mostly from coursework at IIT Delhi. Listed newest first. Most are in C/C++ and Python.

Apr — May 2026

Deep Learning

Representation Learning, Vision–Language Modeling, and Diffusion

CLIP-style contrastive image–text pretraining, DINO self-supervised distillation, a two-stage vision–language model with chain-of-thought QA, and latent-diffusion generation — all from scratch on CLEVR-style data.

COL775 · A2 · w/ Yash Bansal, Tarang Shah

CLIP from scratch — ViT-S/16 image encoder, causal-masked text transformer, dual-projection (CLS + GAP) with auxiliary contrastive loss; DropPath and bicubic positional-embedding interpolation for higher-resolution probing.
DINO self-supervised distillation with multi-crop augmentation and centring; no text supervision.
Two-stage vision–language model: image–text alignment, then chain-of-thought QA with explicit numerical-stability treatment of the CoT logits.
VAE + Latent Diffusion Model trained end-to-end; quantitative comparison via linear probes, t-SNE, and cross-modal retrieval, plus qualitative generation analysis.
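As a rough illustration of the contrastive objective behind CLIP-style pretraining, here is a minimal pure-Python sketch of a symmetric InfoNCE loss. It is not the project's code: the function names, the temperature of 0.07, and the toy embeddings are assumptions for illustration, and it leaves out the encoders, the dual CLS + GAP projection, and the auxiliary loss.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over a list of floats."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def clip_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss; image i and text i form the positive pair."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] = cosine similarity of image i and text j, divided by temperature
    logits = [
        [sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Image-to-text direction: softmax over each row.
    i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: softmax over each column.
    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    t2i = -sum(log_softmax(cols[j])[j] for j in range(n)) / n
    return 0.5 * (i2t + t2i)


# Toy check: matched pairs should score a much lower loss than swapped pairs.
aligned = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

A real training loop would typically compute the same quantity on batched tensors and learn the temperature rather than fixing it.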

Apr — May 2026

Machine Learning

Visual Question Answering

A multimodal VQA system fusing a frozen ResNet-101 image encoder with a Transformer text encoder via multi-head cross-attention, finished with an MLP classifier.

COL774 · A4 · w/ Vipul Vaibhav

ResNet-101 backbone with the final pooling/FC layers removed; projected to a joint embedding via a trainable linear layer.
Transformer text encoder with learnable positional embeddings and padded-token masking; multi-head cross-attention from the [CLS] question token to image features for cross-modal fusion.
Progressive training schedule: frozen backbone → fine-tuned image encoder → further regularization; standard cross-entropy with Adam.
Zero-shot evaluation on held-out question types to probe compositional generalization.
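The fusion step can be pictured as scaled dot-product attention from one question vector to a set of image-region features. The following is a single-head sketch in plain Python, assuming already-projected vectors; `cross_attend` and the toy inputs are hypothetical names, and the learned query/key/value projections and the multiple heads of the actual model are omitted.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(query, keys, values):
    """Attend from one query vector (e.g. the [CLS] token) to image features.

    Returns the fused vector and the attention weights over the image regions.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    dim_v = len(values[0])
    fused = [sum(w * v[c] for w, v in zip(weights, values)) for c in range(dim_v)]
    return fused, weights


# Toy example: the query matches the first image region more closely.
fused, weights = cross_attend(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[10.0, 0.0], [0.0, 10.0]],
)
```

The fused vector is what a downstream MLP classifier would consume in a design like the one described.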

Mar 2026

Deep Learning

Neural Machine Translation with Cross-Lingual Transfer

Four progressively stronger English → Hindi translation models, then transferred to a Hindi → Marathi task to exploit shared script and morphology.

COL775 · A1.2

Implemented four modes side by side: GloVe + BiLSTM without attention, GloVe + BiLSTM + Bahdanau attention, frozen-BERT encoder + attention, and fine-tuned-BERT encoder + attention.
Teacher-forcing schedule with annealed ratio, label smoothing, and beam-search decoding. Full hyperparameter sweep over learning rate, teacher-forcing ratio, and beam width.
Cross-lingual transfer: the English → Hindi model was adapted to Hindi → Marathi, exploiting the shared Devanagari script and overlapping morphology, then analysed qualitatively.
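Beam-search decoding, one of the swept components above, can be sketched independently of any particular model. The version below assumes a hypothetical `step_fn(prefix)` that returns next-token log-probabilities; the toy table stands in for a trained decoder, and length normalisation is left out for brevity.

```python
import math


def beam_search(step_fn, bos, eos, beam_width=3, max_len=10):
    """Return the (tokens, log_prob) hypothesis with the highest total log-probability."""
    beams = [([bos], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for tok, logp in step_fn(tokens).items():
                candidates.append((tokens + [tok], score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        # Keep the top beam_width candidates; completed ones leave the beam.
        for tokens, score in candidates[:beam_width]:
            if tokens[-1] == eos:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    return max(finished, key=lambda c: c[1])


def toy_step(prefix):
    """Hypothetical next-token distribution standing in for a trained decoder."""
    table = {
        ("<s>",): {"a": 0.6, "b": 0.4},
        ("<s>", "a"): {"x": 0.5, "y": 0.5},
        ("<s>", "b"): {"z": 0.9, "w": 0.1},
    }
    probs = table.get(tuple(prefix), {"</s>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


greedy_tokens, greedy_score = beam_search(toy_step, "<s>", "</s>", beam_width=1)
beam_tokens, beam_score = beam_search(toy_step, "<s>", "</s>", beam_width=2)
```

With a beam width of 1 the search is greedy and commits to the locally best first token; widening the beam to 2 lets it recover the sequence with the higher overall probability in this toy table.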

Mar 2026

Deep Learning

ResNet-18 from Scratch, with Seven Normalization Schemes

Built ResNet-18 from scratch on a 100-class ImageNet subset, then trained it under seven different normalization variants for a controlled comparison.

COL775 · A1.1

Every normalization implemented from scratch as nn.Module subclasses — Batch / Instance / Batch-Instance / Layer / Group / a custom BN / a no-norm baseline.
Full modern training recipe: SGD with cosine annealing, MixUp (α = 0.4), RandAugment, label smoothing, mixed-precision, gradient clipping — 100 epochs across all seven variants.
Sanity-checked the custom BN against PyTorch’s built-in (84.2% vs 84.5%; gap explained by Bessel-corrected variance).
Grad-CAM analysis across visually similar classes (snakes, reptiles) to interrogate what each normalization actually learned to attend to.
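The Bessel-correction point can be illustrated in a few lines. PyTorch's BatchNorm documentation states that a training batch is normalised with the biased variance while `running_var` is updated with the unbiased one; which estimator the custom BN used at which step is not stated here, so this sketch only shows how the two estimators differ. The helper names are hypothetical.

```python
def batch_stats(xs):
    """Return (mean, biased variance, Bessel-corrected variance) of a 1-D batch."""
    n = len(xs)
    mean = sum(xs) / n
    sq_dev = sum((x - mean) ** 2 for x in xs)
    return mean, sq_dev / n, sq_dev / (n - 1)


def normalize(xs, mean, var, eps=1e-5):
    """Standardise a batch with the supplied statistics (no affine scale/shift)."""
    return [(x - mean) / (var + eps) ** 0.5 for x in xs]


batch = [1.0, 2.0, 3.0, 4.0]
mean, var_biased, var_unbiased = batch_stats(batch)
out_biased = normalize(batch, mean, var_biased)
out_unbiased = normalize(batch, mean, var_unbiased)
```

The unbiased estimate is larger by a factor of n / (n - 1), so activations normalised with it are slightly smaller in magnitude; the effect shrinks as the number of elements per channel grows.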

Feb — Apr 2025

Operating Systems

Kernel Extensions in xv6

Five subsystems added to the xv6 teaching OS — authentication, syscall access control, custom interrupt handling, a priority-boosted scheduler, and disk-backed page swapping.

Prof. Smruti R. Sarangi

Login authentication with a retry-limited username/password prompt, configured via Makefile macros.
Syscall-level access control (block / unblock other syscalls) and a persistent syscall-history mechanism.
Custom interrupt handler supporting background, foreground, and user-defined modes.
Modified scheduler: priority-boosted, with delayed-fork execution and per-process time limits.
Page-swapping subsystem backed by a real disk swap partition with adaptive replacement.

Mar 2025

Parallel Programming

Parallel Matrix Modification on CUDA

A CUDA implementation that rearranges every element of an N×M matrix to the maximum of its top-left sub-matrix, in O(log n) parallel steps.

COL380 · A3

Reduced the problem to counting sort + parallel prefix sum + binary search, with all three phases on the GPU.
Implemented a work-efficient Blelloch scan for the prefix-sum step — O(log n) parallel steps versus O(n) sequential.
Pinned host memory and multiple CUDA streams to overlap host ↔ device transfers with kernel execution.
Benchmarked on matrices up to 10⁵ × 10⁵ with element ranges to 10⁸; reported execution-time scaling against matrix size and batch count.
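The work-efficient scan mentioned above has two phases, an up-sweep (reduce) and a down-sweep. The sketch below simulates them sequentially in Python rather than CUDA, purely to show the index pattern; each inner `for` loop corresponds to one parallel step on a GPU. It assumes a power-of-two input length, which a real kernel would usually handle by padding.

```python
def blelloch_scan(values):
    """Exclusive prefix sum via up-sweep and down-sweep (power-of-two length)."""
    n = len(values)
    if n == 0:
        return []
    if n & (n - 1) != 0:
        raise ValueError("length must be a power of two in this sketch")
    a = list(values)

    # Up-sweep: build partial sums in place, log2(n) levels.
    stride = 1
    while stride < n:
        for i in range(2 * stride - 1, n, 2 * stride):  # one parallel step
            a[i] += a[i - stride]
        stride *= 2

    # Down-sweep: clear the root, then push prefixes back down the tree.
    a[n - 1] = 0
    stride = n // 2
    while stride >= 1:
        for i in range(2 * stride - 1, n, 2 * stride):  # one parallel step
            left = a[i - stride]
            a[i - stride] = a[i]
            a[i] += left
        stride //= 2
    return a


scanned = blelloch_scan([3, 1, 7, 0, 4, 1, 6, 3])
# scanned == [0, 3, 4, 11, 11, 15, 16, 22]
```

Both phases touch O(n) elements in total, which is what makes the scan work-efficient, while the number of levels is logarithmic in n.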

Oct — Nov 2024

Computer Networks

TCP-like Reliability over UDP

A reliable file-transfer protocol built on UDP — application-layer ACKs, retransmission, TCP Reno + CUBIC congestion control, evaluated on Mininet.

Prof. Tarun Mangla

ACK-based reliability over UDP: cumulative ACKs, retransmissions, and fast recovery driven by sequence numbers.
TCP Reno congestion control — slow start, congestion avoidance, and timeout behaviour for throughput optimisation.
TCP CUBIC implementation; head-to-head fairness / efficiency comparison against Reno under low- and high-latency regimes.
Mininet + Ryu controller experiments quantifying throughput against packet loss and RTT.
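For reference, the window dynamics being compared can be sketched as follows. This is a simplified model in units of segments, not the project's implementation: `reno_step` omits window inflation during fast recovery, and `cubic_window` uses the constants C = 0.4 and beta = 0.7 from the CUBIC RFC without the TCP-friendly region. Function names and the event labels are assumptions.

```python
def reno_step(cwnd, ssthresh, event, mss=1.0):
    """Return (cwnd, ssthresh) after one event: 'ack', 'triple_dup', or 'timeout'."""
    if event == "ack":
        if cwnd < ssthresh:
            cwnd += mss  # slow start: one MSS per ACK, doubling per RTT
        else:
            cwnd += mss * mss / cwnd  # congestion avoidance: about one MSS per RTT
    elif event == "triple_dup":
        ssthresh = max(cwnd / 2, 2 * mss)
        cwnd = ssthresh  # fast retransmit / recovery: halve instead of restarting
    elif event == "timeout":
        ssthresh = max(cwnd / 2, 2 * mss)
        cwnd = mss  # timeout: fall back to slow start
    else:
        raise ValueError(f"unknown event: {event}")
    return cwnd, ssthresh


def cubic_k(w_max, c=0.4, beta=0.7):
    """Time (seconds) for the cubic curve to climb back to w_max after a loss."""
    return (w_max * (1 - beta) / c) ** (1 / 3)


def cubic_window(t, w_max, c=0.4, beta=0.7):
    """CUBIC window t seconds after a loss that occurred at window w_max."""
    return c * (t - cubic_k(w_max, c, beta)) ** 3 + w_max


# Three ACKs in slow start grow the window from 1 to 4 segments.
state = (1.0, 8.0)
for _ in range(3):
    state = reno_step(*state, "ack")
```

Immediately after a loss the cubic curve starts at beta times `w_max`, flattens as it approaches `w_max`, and then probes beyond it; the growth depends on elapsed time rather than on ACK arrivals, which is why the two algorithms behave differently as RTT increases.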

Sep — Oct 2024

Computer Networks

Software-Defined Networking with Ryu

Four OpenFlow controllers on the Ryu framework — from a self-learning switch to a congestion-aware shortest-path router — evaluated on Mininet topologies.

Prof. Tarun Mangla · w/ Anubhav Pandey

Self-learning switch + hub controller; verified the learned flow tables and measured 35.6 Gbps throughput on the learning path.
Spanning Tree Protocol implementation to prevent loops on meshed topologies.
Shortest-path routing via Dijkstra over the discovered topology.
Congestion-aware routing using LLDP-based link-delay measurement and dynamic link-cost re-evaluation.
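The routing core of the last two controllers is Dijkstra's algorithm over a weighted switch graph. Here is a minimal sketch, assuming the topology is held as a dictionary of per-link costs; the switch names and delay values are made up, and the Ryu event handling and flow-rule installation are omitted.

```python
import heapq


def dijkstra(graph, source):
    """Return (dist, prev) for shortest paths from source over non-negative costs."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def path_to(prev, source, target):
    """Rebuild the path by walking predecessors (raises KeyError if unreachable)."""
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical three-switch topology with link costs standing in for measured delay.
topo = {
    "s1": {"s2": 1.0, "s3": 5.0},
    "s2": {"s1": 1.0, "s3": 1.0},
    "s3": {"s1": 5.0, "s2": 1.0},
}
_, prev = dijkstra(topo, "s1")
route_before = path_to(prev, "s1", "s3")

# Simulate congestion on the s1-s2 link and re-run with the updated cost.
topo["s1"]["s2"] = topo["s2"]["s1"] = 10.0
_, prev = dijkstra(topo, "s1")
route_after = path_to(prev, "s1", "s3")
```

Re-running the search whenever link costs are refreshed is the simplest way to get congestion-aware behaviour; here the route shifts from the two-hop path to the direct link once the first hop becomes expensive.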

Aug — Sep 2024

Computer Networks

Concurrent Sockets and Server Scheduling

A client–server suite in C probing the effects of packet sizing, concurrency, and centralized versus decentralized request scheduling.

Prof. Tarun Mangla · w/ Anubhav Pandey

Measured completion time as a function of packet size and client concurrency (1 to 32 concurrent clients).
Implemented a “grumpy server” using multiple decentralized access protocols.
Centralized scheduling algorithms (FCFS, round-robin variants) with logged completion-time analysis.
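The difference between the centralized policies shows up even in a tiny simulation. The sketch below is in Python rather than the project's C, and assumes all requests arrive at time 0 with known service times; the job names and the quantum are illustrative.

```python
from collections import deque


def fcfs(jobs):
    """Serve (name, service_time) jobs in arrival order; return completion times."""
    t, done = 0, {}
    for name, need in jobs:
        t += need
        done[name] = t
    return done


def round_robin(jobs, quantum):
    """Serve jobs in time slices of at most `quantum`; return completion times."""
    queue = deque(jobs)
    t, done = 0, {}
    while queue:
        name, need = queue.popleft()
        run = min(quantum, need)
        t += run
        if need > run:
            queue.append((name, need - run))  # unfinished: back of the queue
        else:
            done[name] = t
    return done


jobs = [("A", 5), ("B", 1), ("C", 2)]
fcfs_done = fcfs(jobs)
rr_done = round_robin(jobs, quantum=1)
mean_fcfs = sum(fcfs_done.values()) / len(jobs)
mean_rr = sum(rr_done.values()) / len(jobs)
```

In this toy workload the long job at the head of the queue delays everyone under FCFS, while round-robin lets the short jobs finish early at no cost to the total makespan.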

Mar — Apr 2024

Computer Architecture

Cache Hierarchy Simulator

A cycle-accurate cache simulator in C++ supporting every standard cache organisation with configurable replacement and write policies.

Prof. Kolin Paul

Direct-mapped, set-associative, and fully-associative caches with LRU and FIFO replacement.
Write-Through / Write-Back × Write-Allocate / Write-No-Allocate configurations for end-to-end efficiency analysis.
CPU-cycle penalties modelled on cache misses to approximate real memory-system behaviour.
Matplotlib sweeps across cache configurations over real load / store traces.
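The core lookup of such a simulator fits in a short class. This Python sketch (the project itself is C++) models a set-associative cache with LRU or FIFO replacement and counts hits and misses only; write policies and cycle penalties are left out, and the class and parameter names are assumptions.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Hit/miss model of a set-associative cache with LRU or FIFO replacement."""

    def __init__(self, num_sets, ways, block_size, policy="lru"):
        if policy not in ("lru", "fifo"):
            raise ValueError("policy must be 'lru' or 'fifo'")
        self.num_sets = num_sets
        self.ways = ways
        self.block_size = block_size
        self.policy = policy
        # One ordered dict per set; the oldest entry sits at the front.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Look up a byte address; return True on a hit, False on a miss."""
        block = address // self.block_size
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            if self.policy == "lru":
                lines.move_to_end(tag)  # refresh recency; FIFO keeps insert order
            self.hits += 1
            return True
        self.misses += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict the front entry
        lines[tag] = True
        return False


trace = [0, 16, 0, 32, 16]
lru = SetAssociativeCache(num_sets=1, ways=2, block_size=16, policy="lru")
fifo = SetAssociativeCache(num_sets=1, ways=2, block_size=16, policy="fifo")
for addr in trace:
    lru.access(addr)
    fifo.access(addr)
```

Setting `ways=1` gives a direct-mapped cache and `num_sets=1` a fully-associative one, so the same class covers all three organisations. On this short trace the two policies evict different blocks, which is the kind of divergence a configuration sweep would surface.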

Jan — Feb 2024

Side project

Stock Trading Platform

A Flask web app for stock screening, multi-equity charting, and pluggable algorithmic-strategy backtesting.

Prof. Huzur Saran

Screening by P/E, PEG, EPS, and other fundamentals; per-ticker RSI and VWAP charting.
Multi-stock comparison on a single chart, with both relative and absolute pricing modes.
Strategy framework with MACD, DMA, DMA++, Linear Regression, RSI, and ADX — run concurrently using threads.
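A pluggable strategy can be as small as a function from a price series to a signal value. The sketch below shows one such indicator, RSI computed with simple averages over the last `period` price changes (the Wilder-smoothed variant differs), plus a thread pool that evaluates several strategies concurrently. The function names, the `strategies` dictionary, and the price series are hypothetical, not the app's actual API.

```python
from concurrent.futures import ThreadPoolExecutor


def rsi(prices, period=14):
    """Relative Strength Index over the last `period` changes (simple averages)."""
    if len(prices) < period + 1:
        raise ValueError("need at least period + 1 prices")
    recent = prices[-(period + 1):]
    changes = [b - a for a, b in zip(recent[:-1], recent[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window
    return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)


def momentum(prices, period=3):
    """Price change over the last `period` steps (a second toy strategy)."""
    return prices[-1] - prices[-1 - period]


def run_all(strategies, prices):
    """Evaluate every strategy on the same price series using worker threads."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, prices) for name, fn in strategies.items()}
        return {name: fut.result() for name, fut in futures.items()}


prices = [1.0, 2.0, 1.0, 2.0, 1.0]
results = run_all(
    {"rsi": lambda p: rsi(p, period=4), "momentum": lambda p: momentum(p, period=3)},
    prices,
)
```

Threads suit this pattern when strategies spend time waiting on I/O such as price downloads; for CPU-bound backtests in Python a process pool may scale better.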