Smart Grid Energy Management: Blockchain-based P2P Energy Trading and Reinforcement Learning-based EV Charging




The adoption of distributed energy resources (DERs) is increasingly being promoted worldwide to address the ongoing energy crisis and to reduce dependence on fossil fuels for energy generation. However, this microgeneration is dispersed and decentralized, which demands an efficient energy management scheme. In this research, we pursue two projects aimed at providing energy management and demand response that facilitate DER integration and minimize demand-supply imbalances while protecting consumers' interests through incentives and revenue generation. First, we analyze the peer-to-peer (P2P) energy trading concept and develop a blockchain-based platform for seamless energy transactions. Second, we develop a deep reinforcement learning (DRL) based electric vehicle (EV) charging control algorithm with the objectives of minimizing energy cost and satisfying charging demand.

A secure and scalable P2P energy trading platform can facilitate the integration of renewable energy and thus contribute to building a sustainable energy infrastructure. The decentralized architecture of blockchain makes it a natural candidate for realizing an efficient P2P energy trading market. However, for a sustainable and dynamic blockchain-based P2P energy trading platform, a few critical aspects, namely security, privacy, and scalability, need to be addressed with high priority. This research proposes a blockchain-based solution for energy trading among consumers that ensures the system's security, protects users' privacy, and improves overall scalability. More specifically, we develop a multilayered, semi-permissioned blockchain-based platform to facilitate energy transactions. The practical Byzantine fault tolerance (PBFT) algorithm is employed as the underlying consensus mechanism for verifying and validating transactions, which ensures the system's tolerance to internal errors and malicious attacks.
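To make the fault-tolerance claim concrete, the following is a minimal sketch of the standard PBFT sizing arithmetic: a network of n validating nodes tolerates at most f Byzantine nodes when n >= 3f + 1, and a decision needs a quorum large enough that any two quorums share at least one honest node. The function names are illustrative assumptions; the abstract does not state the number of validators or the message format used in the platform.

```python
def max_faulty(n: int) -> int:
    """Largest number of Byzantine nodes f tolerated by n nodes (n >= 3f + 1)."""
    if n < 1:
        raise ValueError("need at least one node")
    return (n - 1) // 3


def quorum_size(n: int) -> int:
    """Smallest quorum such that any two quorums overlap in >= f + 1 nodes.

    Equals ceil((n + f + 1) / 2), which reduces to the familiar 2f + 1
    when n is exactly 3f + 1.
    """
    f = max_faulty(n)
    return (n + f) // 2 + 1


def has_quorum(matching_votes: int, n: int) -> bool:
    """True if enough matching votes were collected to accept a transaction block."""
    return matching_votes >= quorum_size(n)


if __name__ == "__main__":
    for n in (4, 7, 10):
        print(n, max_faulty(n), quorum_size(n))
```

With four validators, for example, one faulty node is tolerated and three matching votes are required, so a single malicious validator can neither forge nor block agreement on its own.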
Additionally, we introduce the idea of quality of transaction (QoT), a reward system for network participants that helps determine each participant's eligibility for future transactions. The framework's resilience against transaction malleability attacks is demonstrated with two use cases. Finally, a qualitative analysis is presented to indicate the system's usefulness in improving the overall security, privacy, and scalability of the network.

The transportation industry is rapidly transitioning from internal combustion engine vehicles to EVs to promote clean energy. However, large-scale adoption of EVs can compromise the reliability of the power grid by introducing large uncertainty into the demand curve. Demand response with a controlled charge scheduling strategy for EVs can mitigate such issues. In this part of the research, a DRL-based charge scheduling strategy is developed for individual EVs by considering the user's dynamic driving behavior and charging preferences. The time-varying nature of the user's anxiety about charging the EV battery is rigorously addressed. A dynamic weight allocation technique is applied to automatically tune the user's relative priority between charging and cost saving as a function of charging duration. The sequential charging control problem is formulated as a Markov decision process, and an episodic approach to the deep deterministic policy gradient algorithm with target policy smoothing and delayed policy update techniques is applied to develop the optimal charge scheduling strategy. A real-world dataset that includes users' driving behavior, such as arrival time, departure time, and charging duration, is used for training and testing. Extensive simulation results demonstrate the effectiveness of the proposed algorithm in minimizing energy cost while satisfying the user's charging requirements.
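The two stabilization techniques named above, target policy smoothing and delayed policy updates, can be sketched as follows. This is only an illustration under stated assumptions, not the thesis implementation: the action is assumed to be a single normalized charging rate in [0, 1], and the hyperparameters (`noise_std`, `noise_clip`, `policy_delay`) and function names are hypothetical.

```python
import random


def clip(x: float, low: float, high: float) -> float:
    """Clamp x into the closed interval [low, high]."""
    return max(low, min(high, x))


def smoothed_target_action(target_action: float, rng: random.Random,
                           noise_std: float = 0.2, noise_clip: float = 0.5,
                           low: float = 0.0, high: float = 1.0) -> float:
    """Target policy smoothing: add clipped Gaussian noise to the target
    actor's action before it is fed to the target critic, then clamp the
    result to the valid action range."""
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    return clip(target_action + noise, low, high)


def should_update_actor(critic_step: int, policy_delay: int = 2) -> bool:
    """Delayed policy update: the actor (and target networks) are updated
    only once every `policy_delay` critic updates."""
    return critic_step % policy_delay == 0


if __name__ == "__main__":
    rng = random.Random(0)
    for step in range(1, 5):
        action = smoothed_target_action(0.6, rng)
        print(step, round(action, 3), should_update_actor(step))
```

Smoothing keeps the critic from exploiting sharp peaks in its own value estimate, and updating the actor less often than the critic lets value errors settle before the policy follows them.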