Symmetry, Vol. 15, Pages 1170: NOMA Resource Allocation Method Based on Prioritized Dueling DQN-DDPG Network


Symmetry doi: 10.3390/sym15061170

Authors: Yuan Liu, Yue Li, Lin Li, Mengli He

To address the need for massive connectivity in Internet of Vehicles communications, local wireless networks use non-orthogonal multiple access (NOMA). Deep reinforcement learning has been introduced for user grouping and power allocation to reduce computational complexity. However, the traditional algorithm based on DQN (Deep Q-Network) still converges slowly and trains unstably, and uniform sampling during experience replay has low sampling efficiency. To address these issues, this paper proposes a user grouping and power allocation method for NOMA systems based on joint optimization with a Prioritized Dueling DQN-DDPG network. First, the paper introduces a user grouping network based on Dueling DQN, which evaluates both the state value and the action value in the fully connected layer. The two values compete with each other, are summed, and are re-evaluated. This network significantly improves training stability and convergence speed. Second, a deep deterministic policy gradient (DDPG) algorithm with symmetric properties is used for power allocation. DDPG works well for continuous action spaces, and because the power value remains continuous in the power allocation stage, it avoids power quantization error. Finally, prioritized sampling based on the TD error (temporal-difference error) is combined with the Dueling DQN and DDPG networks, so that sampling remains random while important samples are replayed with higher probability. Simulation results show that the proposed priority-based Dueling DQN-DDPG algorithm significantly improves the convergence speed of sample training. These results provide a foundation for follow-up research on NOMA resource allocation with mobile users.
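The abstract does not give the network equations, so the following is only a minimal sketch of two of the ideas it names, under stated assumptions: the dueling aggregation is assumed to be the common form Q(s, a) = V(s) + A(s, a) - mean(A), and the TD-error priorities are assumed to follow the usual proportional scheme with an exponent `alpha` and a small offset `eps`. The function names and the default values are hypothetical and are not taken from the paper. The DDPG power allocation stage is not covered here.

```python
import random


def dueling_q_values(state_value, advantages):
    """Combine a scalar state value V(s) with per-action advantages A(s, a).

    Assumed aggregation: Q(s, a) = V(s) + A(s, a) - mean(A).
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def priority_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Map TD errors to sampling probabilities.

    Assumed rule: p_i is proportional to (|delta_i| + eps) ** alpha.
    alpha and eps are illustrative defaults, not values from the paper.
    """
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_indices(td_errors, batch_size, rng=random):
    """Draw replay-buffer indices with replacement.

    Sampling stays random, but transitions with a larger TD error
    are drawn more often than under uniform sampling.
    """
    probs = priority_probabilities(td_errors)
    return rng.choices(range(len(td_errors)), weights=probs, k=batch_size)


if __name__ == "__main__":
    # Three candidate actions (for example, three user groupings).
    print(dueling_q_values(1.0, [0.5, -0.5, 0.0]))
    # Four stored transitions; the second has the largest TD error.
    print(sample_indices([0.1, 2.0, 0.5, 0.05], batch_size=8))
```

With `alpha = 0` the priorities become equal and the sampler reduces to uniform replay, which is the baseline the abstract describes as inefficient.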
