Student IT/EE Workshop 2025

Name: Student IT/EE Workshop 2025
Start: 2025-04-24T09:00:00+02:00
End: 2025-04-24T13:10:00+02:00
Location: Stara Kotłownia

Europe/Warsaw timezone

Accelerating AI Inference in the Browser with WebGPU: Evaluating Quantization Trade-offs in Latency, Quality, and Memory Usage

24 Apr 2025, 10:45

30m

SK 04/05 (Stara Kotłownia)

Warsaw University of Technology, Main Campus

Poster Session B (Poster)

Mr Ignacy Ruszpel (Politechnika Warszawska - Wydział Elektryczny), Mr Nikodem Wójcik (Politechnika Warszawska - Wydział Elektryczny)

Recent advances in deep learning and natural language processing have increased the demand for deploying ever more complex models on resource-constrained platforms. Modern browsers, through emerging GPU standards such as WebGPU, now offer a promising venue for real-time AI inference. This paper gives an overview of using WebGPU to accelerate inference directly in the browser, focusing on the trade-offs of different quantization schemes. Our study examines the impact of quantization on inference latency, model quality, and memory usage across several model variants. Preliminary benchmarks indicate that carefully applied quantization can substantially reduce resource demands while maintaining acceptable performance, laying the groundwork for refining quantization techniques and expanding the capabilities of WebGPU-driven inference.
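The memory and quality trade-off mentioned in the abstract can be illustrated with a minimal sketch of symmetric per-tensor int8 quantization. This is not the authors' method: the abstract does not say which quantization schemes were evaluated, and the names quantizeInt8, dequantize, and maxAbsError are hypothetical helpers introduced only for this illustration. The sketch runs on the CPU and does not touch the WebGPU API.

```typescript
// Illustrative only: symmetric per-tensor int8 quantization.
// All names here are hypothetical and not taken from the poster.
interface QuantizedTensor {
  values: Int8Array; // quantized weights, 1 byte each
  scale: number;     // dequantization factor: w ≈ q * scale
}

// Map the range [-maxAbs, maxAbs] onto the integers [-127, 127].
function quantizeInt8(weights: Float32Array): QuantizedTensor {
  let maxAbs = 0;
  for (let i = 0; i < weights.length; i++) {
    maxAbs = Math.max(maxAbs, Math.abs(weights[i]));
  }
  const scale = maxAbs > 0 ? maxAbs / 127 : 1; // avoid division by zero
  const values = new Int8Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    const q = Math.round(weights[i] / scale);
    values[i] = Math.max(-127, Math.min(127, q));
  }
  return { values, scale };
}

// Recover approximate float weights from the quantized form.
function dequantize(t: QuantizedTensor): Float32Array {
  const out = new Float32Array(t.values.length);
  for (let i = 0; i < out.length; i++) out[i] = t.values[i] * t.scale;
  return out;
}

// Largest element-wise difference, a crude proxy for quality loss.
function maxAbsError(a: Float32Array, b: Float32Array): number {
  let err = 0;
  for (let i = 0; i < a.length; i++) {
    err = Math.max(err, Math.abs(a[i] - b[i]));
  }
  return err;
}

const weights = new Float32Array([0.5, -1.0, 0.25, 0.75]);
const quantized = quantizeInt8(weights);
const restored = dequantize(quantized);

// Storage shrinks from 4 bytes to 1 byte per weight (plus one scale value).
console.log(weights.byteLength, quantized.values.byteLength);
// Rounding error per weight is bounded by about half a quantization step.
console.log(maxAbsError(weights, restored) <= quantized.scale / 2 + 1e-6);
```

In this toy case the weight buffer drops from 16 bytes to 4, at the cost of a per-weight rounding error of at most about half the scale. Real model quality has to be measured on the task itself, since element-wise error is only a rough indicator.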

Ruszpel-Wojcik-Poster.pdf
