Reinforcement Learning (RL)-Based Energy Efficient Resource Allocation for Energy Harvesting-Powered Wireless Body Area Network
Abstract
1. Introduction
- We consider a resource allocation problem for energy harvesting-powered wireless body area networks (EH-WBANs) with the goal of maximizing the average energy efficiency of the body sensors. The problem jointly considers the transmission mode, relay selection, allocated time slots, transmission power, and energy status when making the allocation decision;
- We formulate the energy efficiency problem as a discrete-time, finite-state Markov decision process (DFMDP) and propose a modified Q-learning algorithm, which reduces the state-action space of the original Q-learning algorithm, to solve it;
- Numerical analysis shows that, compared with the classical Q-learning algorithm, the proposed scheme achieves the best energy efficiency and converges more rapidly, because it eliminates the irrelevant exploration space in the Q-table.
2. Network Model Descriptions
2.1. Network Model
2.2. Data Transmission Model
2.3. Data Serving Model
2.4. Energy Harvesting Model
2.5. Energy Efficiency Model
3. Problem Formulation and Optimization Algorithm
3.1. DFMDP Model
- The state of each body sensor in the k-th time slot contains two parts: the data queue length and the energy queue length of the n-th body sensor at the beginning of the k-th time slot. To ensure that the state space can be explored completely, both queue lengths are restricted to integer values, bounded by the maximum traffic queue length and the maximum energy queue length, respectively.
- The action a consists of the resource allocation variables: the transmission mode, the time slot allocation, the relay selection, and the power allocation. To ensure that the action space can be explored completely, the transmission powers are subject to the maximum transmission power of the body sensors.
- The reward r is the immediate reward for the current state–action pair, which is given by Equation (20).
Algorithm 1. The Q-learning based resource allocation algorithm
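As a rough illustration of the kind of update a Q-learning based allocation scheme relies on, the following is a minimal sketch of standard tabular Q-learning with epsilon-greedy exploration. It is not the authors' Algorithm 1. The toy environment is an assumption made purely for illustration: the state is a (data queue, energy queue) pair, the action is a hypothetical number of energy packets to spend in the slot, and the reward is a stand-in for energy efficiency rather than Equation (20). The names `D_MAX`, `E_MAX`, `ACTIONS`, `step` and `q_learning`, and all hyperparameter values, are hypothetical.

```python
import random

D_MAX, E_MAX = 4, 4      # hypothetical queue capacities, in packets
ACTIONS = [0, 1, 2]      # hypothetical action: energy packets requested this slot


def step(state, action, rng):
    """Toy transition, assumed for illustration only."""
    d, e = state
    spent = min(action, e)                    # cannot spend more energy than is stored
    sent = min(spent, d)                      # one data packet per energy packet, if queued
    reward = sent / spent if spent else 0.0   # stand-in for energy efficiency
    d_next = min(d - sent + rng.randint(0, 1), D_MAX)    # random data arrival
    e_next = min(e - spent + rng.randint(0, 1), E_MAX)   # random energy harvest
    return (d_next, e_next), reward


def q_learning(episodes=200, steps=50, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Standard tabular Q-learning with epsilon-greedy action selection."""
    rng = random.Random(seed)
    q = {((d, e), a): 0.0
         for d in range(D_MAX + 1)
         for e in range(E_MAX + 1)
         for a in ACTIONS}
    for _ in range(episodes):
        state = (rng.randint(0, D_MAX), rng.randint(0, E_MAX))
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)                         # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])   # exploit
            next_state, reward = step(state, action, rng)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Temporal-difference update toward reward + discounted best next value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q_table = q_learning()
```

The Q-table here has one entry per state-action pair, so its size grows with the product of the queue capacities and the number of actions, which is the growth that a reduced state-action space is meant to curb.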
3.2. The Proposed Modified Q-Learning Algorithm
4. Simulation Results and Analysis
4.1. Simulation Setting
4.2. Results and Analysis
4.2.1. Comparison between the Proposed Algorithm and Classical Q-Learning Algorithm
4.2.2. The Influence of the Number of Body Sensors Deployed
4.2.3. The Standard Deviation of Consumed Energy of Each Body Sensor
4.2.4. The Influence of Energy Harvesting Rate and Data Arrival Rate
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
| Symbol | Definition |
|---|---|
| H | Hub |
| | n-th body sensor |
| | Time slot |
| | k-th time slot |
| | Transmission mode of the n-th body sensor |
| | k-th time slot is assigned to the n-th body sensor for direct transmission |
| | k-th time slot is allocated to the n-th body sensor for transmitting data to the m-th body sensor |
| | m-th body sensor forwards the data from the n-th body sensor to the hub in the k-th time slot |
| | Data rate of the n-th body sensor |
| | Data rate of the n-th body sensor in direct transmission mode |
| | Data rate of the n-th body sensor in cooperative transmission mode |
| | SINR of the n-th body sensor in the k-th time slot in direct transmission mode |
| | SINR of the source-relay link in the k-th time slot in cooperative transmission mode |
| | SINR of the relay-hub link in the k-th time slot in cooperative transmission mode |
| | Transmission power of the n-th body sensor in the k-th time slot in direct transmission mode |
| | Transmission gain between the n-th body sensor and the hub |
| | Transmission power of the n-th body sensor in the k-th time slot to the m-th body sensor in cooperative transmission mode |
| | Transmission gain between the n-th body sensor and the m-th body sensor in cooperative transmission mode |
| | Transmission power of the m-th body sensor in the k-th time slot to the hub in cooperative transmission mode |
| | Transmission gain between the m-th body sensor and the hub in cooperative transmission mode |
| | Noise power |
| | Data rate of the source-relay link in cooperative transmission mode |
| | Data rate of the relay-hub link in cooperative transmission mode |
| | Data queue length at the n-th body sensor in time slot k |
| | Maximum traffic queue length of body sensors |
| | Arriving traffic packets of the n-th body sensor in time slot k − 1 |
| | Energy queue length at the n-th body sensor in time slot k |
| | Maximum energy queue length of body sensors |
| | Amount of energy harvested by the n-th body sensor in time slot k − 1 |
| | Data packet size |
| | Energy packet size |
| | Maximum transmission power of body sensors |
| | Energy efficiency of the n-th body sensor in time slot k |
State Space | Needs to Be Explored |
---|---|
{0, 0} | No |
{0, *} | No |
{*, 0} | No |
{*, *} | Yes |
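The rule in the table above can be sketched as a simple filter over the state space. This is a minimal sketch under two assumptions that are not confirmed by the surviving text: that each state is a (data queue length, energy queue length) pair, and that `*` stands for any nonzero value. The function names `needs_exploration` and `explorable_states` are hypothetical.

```python
from itertools import product


def needs_exploration(data_queue, energy_queue):
    """Return True only for {*, *} states, i.e. both entries nonzero (assumed reading)."""
    return data_queue > 0 and energy_queue > 0


def explorable_states(d_max, e_max):
    """List the states that remain after dropping {0, 0}, {0, *} and {*, 0}."""
    return [(d, e)
            for d, e in product(range(d_max + 1), range(e_max + 1))
            if needs_exploration(d, e)]


all_states = (3 + 1) * (3 + 1)        # 16 states for queue lengths 0..3
kept = explorable_states(3, 3)        # 9 states with both entries nonzero
```

With maximum queue lengths of 3, for example, the filter keeps 9 of the 16 states, so any Q-table indexed by these states shrinks accordingly.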
Modified Q-Learning Algorithm | Classical Q-Learning Algorithm |
---|---|
$xy$ | $x^{3}y^{2}$ |
$xy$ | $x^{5}y^{3}$ |
$\vdots$ | $\vdots$ |
$xy$ | $x^{2n+1}y^{n+1}$ |
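To make the gap between the two columns concrete, the ratio of the entries can be worked out directly. This is only arithmetic on the table entries, reading the classical entry in the last row as $x^{2n+1}y^{n+1}$. The meanings of $x$, $y$ and $n$ are taken as given by the table and are not restated here.

```latex
\frac{x^{2n+1}\,y^{n+1}}{x\,y} = x^{2n}\,y^{n},
\qquad\text{e.g. } n = 1:\ \frac{x^{3}y^{2}}{xy} = x^{2}y,
\qquad n = 2:\ \frac{x^{5}y^{3}}{xy} = x^{4}y^{2}.
```

Under this reading, the classical entry exceeds the modified one by a factor that grows exponentially in $n$ whenever $x > 1$ and $y > 1$, while the modified entry stays at $xy$.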
| Parameters | Value |
|---|---|
| R | 10 m |
| Distance of each body sensor | Randomly distributed in (2, 5) m |
| | (1:1:10) |
| B | 1 MHz |
| Noise power | −94 dBm/Hz |
| Maximum transmission power of body sensors | 10 dBm |
| | (1:1:8) packet/time slot |
| | (1:1:8) packet/time slot |
| | 200 |
| | 0.5 ms |
| Data packet size | 8 bits/packet |
| Energy packet size | 0.0002 J/packet |
| | 50 packets |
| | 50 packets |
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Xu, Y.-H.; Xie, J.-W.; Zhang, Y.-G.; Hua, M.; Zhou, W. Reinforcement Learning (RL)-Based Energy Efficient Resource Allocation for Energy Harvesting-Powered Wireless Body Area Network. Sensors 2020, 20, 44. https://doi.org/10.3390/s20010044