Modern power systems require accurate and uncertainty-aware forecasts of electricity demand to maintain stability and optimize resource scheduling. Traditional models such as ARIMA and recurrent neural networks (RNNs) often struggle to capture long-range temporal dependencies and to quantify prediction uncertainty. This paper presents a Transformer-based forecasting framework that integrates self-attention mechanisms with probabilistic quantile regression to model sequential dependencies in short-term and medium-term energy-load data. Historical consumption, weather, and calendar attributes are pre-processed through regime-aware segmentation to enhance robustness across seasonal and behavioral variations. The proposed system generates both point and interval forecasts, providing operators with calibrated P10, P50, and P90 values, and is benchmarked against ARIMA, Random Forest, and Gradient Boosting models using deterministic (MAPE, RMSE, MAE) and probabilistic (CRPS, PICP) metrics. Expected results indicate a 20–25% improvement in forecasting accuracy, with MAPE below 2.5% for day-ahead horizons and below 5% for week-ahead horizons. The unified framework demonstrates how Transformer architectures can advance data-driven energy analytics through reliable, explainable, and risk-sensitive forecasting.

Keywords: Transformer • Self-Attention • Probabilistic Forecasting • Electricity Load Forecasting • Quantile Regression • Smart Grid Analytics • MAPE • CRPS • Uncertainty Quantification
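To make the evaluation terms above concrete, the following is a minimal sketch, not the paper's implementation, of three of the quantities named in the abstract: the quantile (pinball) loss that quantile regression minimizes, the prediction interval coverage probability (PICP) of a P10 to P90 band, and MAPE for the P50 point forecast. The function names and the four-hour load values are hypothetical and chosen only for illustration. Averaging the pinball loss over many quantile levels is a common discrete approximation to CRPS, which is not computed here.

```python
from typing import Sequence


def pinball_loss(y_true: Sequence[float], y_pred: Sequence[float], q: float) -> float:
    """Mean quantile (pinball) loss for a quantile level q in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def picp(y_true: Sequence[float], lower: Sequence[float], upper: Sequence[float]) -> float:
    """Fraction of observations that fall inside [lower, upper]."""
    inside = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return inside / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (assumes no zero loads)."""
    return 100.0 * sum(abs((y - f) / y) for y, f in zip(y_true, y_pred)) / len(y_true)


# Hypothetical hourly loads (MW) and quantile forecasts, for illustration only.
load = [100.0, 120.0, 110.0, 130.0]
p10 = [95.0, 112.0, 104.0, 131.0]
p50 = [101.0, 118.0, 111.0, 134.0]
p90 = [108.0, 125.0, 118.0, 140.0]

coverage = picp(load, p10, p90)           # share of hours inside the P10-P90 band
point_error = mape(load, p50)             # accuracy of the median forecast
median_loss = pinball_loss(load, p50, 0.5)

print(f"PICP (P10-P90): {coverage:.2f}")
print(f"MAPE (P50): {point_error:.2f}%")
print(f"Pinball loss (q=0.5): {median_loss:.2f}")
```

A P10 to P90 band has a nominal coverage of 80%, so a PICP close to 0.80 on held-out data is one simple indication that the interval forecasts are calibrated.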