A novel reinforcement learning based routing algorithm for energy management in networks

    *Corresponding author: ÇİĞDEM ERİŞ 
  • Underwater wireless sensor networks (UWSNs) can provide environmental data for a range of applications, including studies of environmental change, early warning systems, and industrial monitoring. Continuous delivery of information is paramount in these contexts, and UWSNs are the fundamental assets of such applications. However, the characteristics of the underwater environment force sensor nodes to rely on their limited battery reserves. Consequently, energy management becomes a critical resource allocation problem in these networks. To address this decision-making problem, cluster-based routing protocols have been extensively explored as a way to minimize network energy consumption: cluster heads (CHs) aggregate data and reduce overall energy usage, thus prolonging the network's lifetime. In parallel, harvesting energy from ambient underwater resources has gained attention as a means to extend the operational life of sensor nodes in the distributed communication network. This paper considers the stochastic energy harvesting process at each sensor node and addresses the energy-aware routing problem in underwater acoustic sensor networks (UASNs). Its contribution is a novel reinforcement learning-based algorithm for selecting CHs that considers not only the nodes' positions and residual energy but also their expected harvested energy. Numerical results show that the proposed approach significantly decreases energy consumption and substantially extends the network's operational lifetime.

    Mathematics Subject Classification: Primary: 90B50, 68M12, 68M14, 68M18, 68M20; Secondary: 90B18, 68M10, 91B32.


  • Figure 1.  Sample 3D network architecture showing surface gateway and node positions

    Figure 2.  Timeline showing round based data collection in clustered networks

    Figure 3.  Simulated power harvested by the vibration-based piezoelectric harvester

    Figure 4.  Number of alive nodes versus simulation rounds for different cluster head selection policies

    Figure 5.  Lifetime comparison of different cluster head selection policies in simulation time

    Figure 6.  Number of alive nodes versus number of data items received

    Figure 7.  Number of data signals received over time for k-means clustering with different cluster head selection policies

    Table 1.  Comparison of Related Research Studies

    Related Studies Distributed Energy Harvesting (EH) Clustering Machine Learning Methods Stochastic EH
    [25], [47]
    [8], [9]
    [20], [14]
    [31], [30], [28], [11]
    [6], [35], our previous work [10]
    Our Work
    Table 4.  Simulation Parameters

    Network Parameters
    $ N_{node} $ Number of nodes 100
    $ M $ Network Size ($ m^3 $) $ 250\times250\times250 $
    $ h $ Network Depth ($ m $) 250
    $ S_{(x,y,z)} $ Sink Location (125,125, 0)
    $ T_{round} $ Round Duration (seconds) 500
    $ T_{frame} $ Frame Duration (seconds) 25
    Communication Channel Parameters
    $ R $ Data rate (kbps) 2
    $ E_0 $ Initial Energy (kJ) 1
    $ P_{rx} $ Reception Power (W) 1.3
    $ P_{idle} $ Idle Power (mW) 2.5
    $ f $ Frequency (kHz) 25
    $ p_{ij}^{b} $ Bit error rate on link $ (i, j) $ $ 10^{-10} $
    $ L $ Data-Packet size in single frame (bits) 1024
    $ \kappa $ Spreading Factor 1.5
    $ B_N $ Noise Bandwidth (kHz) 1
    $ E_{DA} $ Data Aggregation Energy (nJ/bit/packet) 5
    Piezoelectric Harvester Parameters
    $ L $ Cantilever length (mm) 50
    $ B $ Cantilever width (mm) 3
    $ \varepsilon_r $ Relative permittivity 4000
    $ \varepsilon_0 $ Absolute permittivity ($ F/m $) $ 8.85\times10^{-12} $
    $ T_{pzt} $ Thickness ($ \mu m $) 60
    $ d_{31}^p $ Piezoelectric constant ($ C/N $) $ 300\times10^{-12} $
    $ S_t $ Strouhal number 0.2
    $ \lambda $ Mean Event Rate 5
    $ v_f $ Fluid velocity (m/s) 0.65
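
As a quick sanity check on how the Table 4 values relate to one another, the sketch below derives a few quantities from them. All variable names are hypothetical. The per-packet reception energy is taken as reception power times packet airtime, and the piezoelectric layer is treated as a parallel-plate capacitor. Both are illustrative assumptions about how the parameters might be combined, not formulas taken from the paper.

```python
# Values copied from Table 4, converted to SI units.
T_ROUND_S = 500.0         # round duration
T_FRAME_S = 25.0          # frame duration
DATA_RATE_BPS = 2_000.0   # data rate, 2 kbps
PACKET_BITS = 1024        # data-packet size in a single frame
P_RX_W = 1.3              # reception power
P_IDLE_W = 2.5e-3         # idle power, 2.5 mW

# Piezoelectric harvester geometry and material constants.
EPS_R = 4000.0            # relative permittivity
EPS_0 = 8.85e-12          # absolute permittivity, F/m
CANTILEVER_LEN_M = 50e-3
CANTILEVER_WIDTH_M = 3e-3
PZT_THICKNESS_M = 60e-6

# Number of frames that fit into one data-collection round.
frames_per_round = int(T_ROUND_S // T_FRAME_S)

# Time needed to send one packet at the nominal data rate.
packet_airtime_s = PACKET_BITS / DATA_RATE_BPS

# Assumption: receiving a packet costs reception power times its airtime.
rx_energy_per_packet_j = P_RX_W * packet_airtime_s

# Upper bound on idle energy: the node idles for the entire round.
idle_energy_per_round_j = P_IDLE_W * T_ROUND_S

# Assumption: the PZT layer behaves like a parallel-plate capacitor.
pzt_capacitance_f = (
    EPS_R * EPS_0 * CANTILEVER_LEN_M * CANTILEVER_WIDTH_M / PZT_THICKNESS_M
)

print(frames_per_round, packet_airtime_s, rx_energy_per_packet_j)
print(idle_energy_per_round_j, pzt_capacitance_f)
```

Under these assumptions a round holds 20 frames, one packet occupies the channel for about 0.51 s and costs roughly 0.67 J to receive, and the layer capacitance comes out near 88.5 nF. Against the 1 kJ initial energy, this gives a rough sense of how quickly reception alone can drain a cluster head.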
    Table 2.  Energy State, $ State_{E_{ratio_n}} $

    Energy State
    0 $ E_{res}(t-1)> 0.9 $
    1 $ E_{res}(t-1)> 0.5 $
    2 $ E_{res}(t-1)> 0.2 $
    3 $ E_{res}(t-1)> 0.1 $
    4 $ E_{res}(t-1)< 0 $
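
The thresholds in Table 2 amount to a simple discretization of a node's residual energy. The sketch below is one possible reading, not the authors' implementation. It assumes that $ E_{res}(t-1) $ is expressed as a fraction of the initial energy, and that every value not above 0.1 falls into state 4 so that the five states cover the whole range. The function name `energy_state` is hypothetical.

```python
def energy_state(e_ratio: float) -> int:
    """Map a residual-energy ratio to a discrete energy state (0 = fullest)."""
    # Lower bounds for states 0..3, as listed in Table 2 (strict inequalities).
    thresholds = (0.9, 0.5, 0.2, 0.1)
    for state, threshold in enumerate(thresholds):
        if e_ratio > threshold:
            return state
    # Assumption: anything at or below 0.1 is treated as the depleted state.
    return 4


if __name__ == "__main__":
    for ratio in (0.95, 0.6, 0.3, 0.15, 0.05):
        print(ratio, energy_state(ratio))
```

Because the comparisons are strict, a ratio exactly equal to a threshold (for example 0.5) falls into the next lower-energy state.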
    Table 3.  Harvested Energy State, $ State_{EH_n} $

    Energy Harvesting State
    0 $ Harv_{(t-1)}< 0.01 $
    1 $ Harv_{(t-1)} \le 0.03 $
    2 $ Harv_{(t-1)}> 0.03 $
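
The harvested-energy levels of Table 3 can be discretized in the same spirit. The following is a minimal sketch under stated assumptions. The table does not give a unit for $ Harv_{(t-1)} $, so the function takes the value in whatever unit the thresholds use. The `joint_state` helper, which flattens an energy state (0 to 4) and a harvesting state (0 to 2) into a single index, is purely hypothetical: the text shown here does not say how the two state components are combined.

```python
N_HARVEST_STATES = 3  # states 0, 1, 2 from Table 3


def harvest_state(harvested: float) -> int:
    """Map the energy harvested in the previous step to a discrete state."""
    if harvested < 0.01:
        return 0
    if harvested <= 0.03:
        return 1
    return 2


def joint_state(e_state: int, h_state: int) -> int:
    """Hypothetical flattening of (energy state, harvest state) into one index."""
    if not 0 <= e_state <= 4 or not 0 <= h_state < N_HARVEST_STATES:
        raise ValueError("state component out of range")
    return e_state * N_HARVEST_STATES + h_state


if __name__ == "__main__":
    for value in (0.005, 0.02, 0.05):
        print(value, harvest_state(value))
```

With five energy states and three harvesting states, such a flattening would yield 15 distinct indices, which keeps a tabular learner small. Whether the original algorithm uses a table of this shape is not stated in the source.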
