Abstract: Current LTE and LTE-A deployments require greater effort in radio resource management due to the increase in users and the constantly growing demand for services. For this reason, automatic optimization is a key point to avoid issues such as inter-cell interference. This paper presents several machine-learning proposals focused on this automatic optimization problem. The reviewed works seek to enable cellular systems to achieve self-optimization, a key concept within self-organized networks, whose main objective is to make networks capable of automatically responding to the particular needs of dynamic network traffic scenarios.
Keywords: Machine learning, self-organization, ICIC, LTE.
State of the Art
Machine Learning Algorithms for Inter-Cell Interference Coordination

Received: 10 May 2018
Accepted: 26 June 2018
The amount of User Equipment [UE] within large and dense mobile networks has increased considerably in recent years (Fernández, González, & Hernández, 2014) due to the data demand driven by the requirements of current mobile services (Borkar & Pande, 2016). In order to support these bandwidth demands, standards such as Long-Term Evolution [LTE] and Long-Term Evolution – Advanced [LTE-A] must perform adequate radio resource management (Kibria, Villardi, & Nguyen, 2016; Hu & Pang, 2012) and improve their optimization functions, seeking to respond suitably to the needs of high traffic density scenarios (Guio & Hernández, 2013).
There are several alternatives to optimize radio resource allocation, ranging from the deployment of low-power cells or Heterogeneous Networks [HetNet] (Hu & Pang, 2012) to intelligent resource allocation schemes (Glenn, Imran, & Evans, 2013). Nevertheless, these proposals have shortcomings, since they are not able to adapt dynamically and efficiently to scenarios with variable network traffic; this leads to problems such as Inter-Cell Interference [ICI] (Fernández et al., 2014). Among the alternatives to achieve dynamic adaptation are the machine learning techniques involved in Self-Organizing Networks [SON] (Van den Berg et al., 2008). These include areas such as self-optimization via decision algorithms, applicable to frequency reuse to reduce ICI, achieving automation by learning from previous actions (5GAmericas, 2013).
In the mobile networks field, the SON concept has been considered by organizations such as the 3rd Generation Partnership Project [3GPP] (Feng, 2008) and the Next Generation Mobile Networks [NGMN] alliance (Behjati, 2014). Within the 3GPP, SON functionalities have been added since release 8 for multi-vendor environments (3GPP, 2014), up to releases 10 and 11, where more advanced features such as HetNets, enhanced Inter-Cell Interference Coordination [eICIC], and coordination between SON functions —among others— are supported (5GAmericas, 2013). For the particular case of the NGMN, some use cases for SON are defined (Glenn et al., 2013).
The automation required to classify a network as self-organized can be achieved through Machine Learning [ML] algorithms (Jiang et al., 2017), a set of tools widely used nowadays due to their ability to make predictions from a set of abstract rules. When ML techniques are combined with current data analysis capacities and computing power, their predictive capacity is greatly enhanced, leading to adequate results (Rayón, 2017). ML comprises several approaches covered in this paper and applicable to the self-optimization context. The most relevant are: reinforcement learning (Moysen, Giupponi, Carl, & Gauss, 2014), Q-learning (Kumar, Kanagaraj, & Srilakshmi, 2013; Gadam, Ahmed, & Kyun, 2016), statistical learning (Sierra & Marca, 2015), and pattern classification (Thilina, Choi, Saquib, & Hossain, 2013), among others.
The present paper gathers the basic concepts of some of the ML algorithms most commonly employed for ICIC, both in LTE/LTE-A and in HetNets. These works are focused on achieving self-organized systems. We describe the structure and order of this document in the following section.
We start with a brief review of the LTE channel model and the general frequency reuse process, as well as a classification of frequency reuse techniques —including fractional techniques— to gain a better understanding of the interference problem in resource allocation. Afterwards, we include a classification of the machine learning techniques most commonly applied to ICI, not only in LTE/LTE-A, but also in heterogeneous networks. The literature review covers research works from the last four years, including specialized proposals that pursue self-organization through ML algorithms. Finally, we present some conclusions that might be useful as a reference in the area.
Two main types of interference are present in LTE/LTE-A networks: intra-cell and inter-cell. The first is produced between frequency channels within the same cell and, thanks to OFDMA and its set of orthogonal subcarriers (see Figure 1), its impact is low (Fernández et al., 2014). In contrast, ICI is produced between a frequency channel in a cell and the same channel used in an adjacent cell. LTE/LTE-A still face the challenge of solving this issue, particularly at the cell edges (Abusaid & Salem, 2017), because the frequency reuse factor is 1.

Figure 2 shows the ICI scenarios between base stations; Figure 2a presents the problem of using the same frequency band in an adjacent cell, a type of ICI that can be reduced through the allocation of different frequency bands to the neighbor cells, as shown in Figure 2b (Hamza, Khalifa, & Hamza, 2013).

For frequency resource allocation, ICIC employs the minimal resource allocation unit called Resource Block [RB] (Keysight Technologies, 2018). An RB occupies 180 kHz of bandwidth, equivalent to 12 subcarriers spaced 15 kHz apart; within an RB, 6 or 7 OFDMA symbols are transmitted. The duration of an RB is 0.5 ms —i.e., the duration of a time slot—. ICIC is a technique that seeks to improve throughput, especially at the cell edges, by exchanging information between base stations through the X2 interface (ETSI, 2018). The 3GPP does not specify algorithms or control methods; hence, service providers can implement their own solutions (Dai & Hiroyuky, 2012).
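As a quick numeric check of these figures, the following sketch reproduces the resource-block arithmetic described above (the 10 MHz carrier and the roughly 9 MHz of usable spectrum after guard bands are illustrative assumptions):

```python
# Minimal sketch: LTE resource-block arithmetic as described above (values from the text).
SUBCARRIER_SPACING_HZ = 15_000      # 15 kHz per subcarrier
SUBCARRIERS_PER_RB = 12             # 12 subcarriers per resource block
SLOT_DURATION_MS = 0.5              # one RB spans one 0.5 ms time slot

rb_bandwidth_hz = SUBCARRIER_SPACING_HZ * SUBCARRIERS_PER_RB   # 180 kHz
print(f"RB bandwidth: {rb_bandwidth_hz / 1e3:.0f} kHz over {SLOT_DURATION_MS} ms")

# Hypothetical example: number of RBs available in a 10 MHz carrier,
# assuming ~9 MHz of usable bandwidth after guard bands (illustrative assumption).
usable_bandwidth_hz = 9_000_000
print("RBs per slot in 10 MHz:", usable_bandwidth_hz // rb_bandwidth_hz)  # 50
```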
In static models, the optimal values for parameters such as power, number of sub-bands, and frequencies assigned per cell do not vary; they are determined as a function of environments with fixed traffic patterns. These models include the conventional frequency planning schemes (e.g., reuse 1 and reuse 3, also called hard frequency reuse), as well as Fractional Frequency Reuse [FFR] schemes (Budihal, Siddamal, & Banakar, 2016), which are described in the following section. Regardless of their differences, these schemes require specifying common parameters such as the set of channels (sub-bands) to be used in each cell, the power, and the region of the cell (center or edge) where each sub-band is used.
This scheme separates the frequency bands allocated to the outer areas of a base station by fixing a frequency reuse of 3. For the central regions, the transmission power is reduced and a frequency reuse of 1 is applied (see Figure 2c). By doing this, it is possible to improve the Signal to Interference plus Noise Ratio [SINR] and the throughput for UE at the cell edges. Some FFR schemes presented by Hamza et al. (2013) are described as follows:
This reuse technique uses two sub-bands with different frequency reuse factors. A common sub-band of the system bandwidth is used inside the cell (reuse 1), while the remaining part of the bandwidth is divided among the neighbor eNBs —similar to hard frequency reuse, where N is the frequency reuse factor and N > 1— to create a sub-band with a low ICI level in each sector.
The UE in the center receive the fully reused frequency fragments, while the UE at the edges receive the orthogonal fragments. This means that the inner UE of a cell do not share any spectrum with the UE at the edges of a neighboring cell, reducing the ICI (see Figure 3).
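A minimal sketch of this partition, with hypothetical RB counts and a 50/50 center/edge split that are not taken from any of the cited works, could look as follows:

```python
# Illustrative sketch of the partial frequency reuse split described above:
# one common sub-band for cell-center UE (reuse 1) and orthogonal edge sub-bands
# shared out among N = 3 neighbour cells. Sizes are hypothetical.
def partial_frequency_reuse(total_rbs=48, center_fraction=0.5, n_cells=3):
    n_center = int(total_rbs * center_fraction)
    center_band = list(range(n_center))                 # used by every cell's center UE
    edge_rbs = list(range(n_center, total_rbs))         # split orthogonally among cells
    per_cell = len(edge_rbs) // n_cells
    edge_bands = {c: edge_rbs[c * per_cell:(c + 1) * per_cell] for c in range(n_cells)}
    return center_band, edge_bands

center, edges = partial_frequency_reuse()
print("center sub-band (all cells):", center[:4], "...")
for cell, band in edges.items():
    print(f"cell {cell} edge sub-band:", band)
```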
For this scheme, two versions are available. In the first one, the sub-band dedicated to edge UE can also be used by the UE in the center of the cell, but with a reduced power level and only if it is not occupied by edge UE; the center sub-band is available only for the central UE (see Figure 4a). In the second version, the UE in the center have no access to the edge sub-band; hence, each cell can use the whole system bandwidth while the interference to the neighbor cells is reduced (see Figure 4b).
Xie (2009) describes Enhanced Fractional Frequency Reuse [EFFR] as a scheme that improves the performance of the previous one, as well as the system capacity in high-load situations. This scheme defines three reuse types for the neighbor cells and reserves a part of the complete frequency band (primary segment) for each cell type; across the different cell types, these bands must be orthogonal. The remaining subchannels form the secondary segment. The primary segment of a cell type is, at the same time, part of the secondary segments belonging to the other two cell types. Each cell can use all the subchannels of its primary segment if required, whereas it can only use a part of the subchannels of its secondary segment, taking interference into account (see Figure 5). Since each eNB needs to know the location of the primary and secondary segments of the other cell types, it computes them assuming that the configuration is the same for every cell, differing only in the bandwidth displacements (Mohamed & Abd-Elnaby, 2016).
Within this classification, it is possible to include dynamic schemes of low, intermediate, and high level (Hamza et al., 2013). In the first group, previously planned and optimized configuration parameters are available for each case, based on the different traffic loads and UE distributions; hence, the Base Stations [BS] exchange information. In intermediate-level schemes, the optimal values are computed according to the amount and distribution of UE in each cell, using the data available from the cells without being limited to predetermined configurations. High-level schemes improve on the intermediate level in the way the optimal values are calculated. These parameters include the power ratio, the number of sub-bands, and the frequency allocation. Additionally, the number of sub-bands to be assigned to each UE is calculated as a function of the channel condition. Variations of FFR schemes can also be used.
This type of scheme seeks to address ICIC through management and control models and through the coordinated exchange of information among cells, achieving ICIC values that adapt to different traffic conditions (Hamza et al., 2013). The schemes can be classified in four categories: centralized, semi-distributed, coordinated-distributed, and autonomously distributed. In the first one, a central control unit is in charge of gathering all the information relative to the channel state and sending it in a coordinated manner to each eNodeB to manage the resource allocation. The second one presents a general coordination at two levels, eNodeB and central entity, which allows a more efficient information exchange. In autonomously distributed schemes, the coordination is performed only at the eNB level without a central coordination entity. Contrary to the coordinated-distributed schemes, the autonomously distributed ones do not require inter-node coordination, since each eNodeB assigns channels to its UE using local information. This has the advantage of eliminating the signaling overhead via a scheduling algorithm, so that decisions can be adapted quickly to the channel features.
Self-organization is a key concept, related to autonomy, for the future evolution of mobile networks. To achieve self-organized ICIC, each eNB must dynamically restrict some of its RBs through techniques such as power control. This involves parameters such as the SINR levels of the received RBs, where a low SINR level indicates that an RB is being used by a neighbor eNB.
Deb (2014) presents an algorithm to reduce the ICI based on the allocation of Almost Blank Subframes [ABS] and on the Cell Selection Bias [CSB] in the LTE standard. The algorithm determines the optimal associations between ABS and UE through two main tasks: determining the amount of radio resources that the macrocells should offer to the picocells, and determining the association rules that decide which UE should associate with picocells. In this work, the authors validate their solution by considering the load in the cells, the macro-pico interference maps, and the network topology. The solution is also assessed in a real scenario in New York by applying the SON framework. Furthermore, eICIC is handled through the allocation of ABS, which is a key point in frequency reuse.
Regarding the use of EFFR schemes, Mohamed and Abd-Elnaby (2016) present an approach called Self-Organized Dynamic Resource Allocation scheme using Enhanced Fractional Frequency Reuse [SODRA-EFFR], which seeks to improve the performance and coverage at the cell edges through the dynamic and self-organized allocation of resources (power and frequency) to the inner and outer regions of the cells in relay-based LTE-A networks. In this scheme, the power and frequency allocation between the eNBs and their relays in each cell is performed via the coordination of neighbor eNBs and relays through the X2 interface. The performance of this approach was assessed via MATLAB simulations with and without relays, and the results were compared using different combinations of resource allocation to the inner and outer regions of the cells with reuse factors of 1 and 3, as well as with their SFR scheme.
Klaine, Imran, Onireti, and Souza (2017) state that —at a general level— machine learning algorithms are classified as supervised, unsupervised, and Reinforcement Learning [RL] (Razavi, Klein, & Claussen, 2010). In supervised learning, a supervisor is involved in training the system and, normally, a historical dataset is employed by applying tags (labels) to specific features; this allows the results to be predicted. In unsupervised learning there is no supervisor and no tags in the data to predict results, since the expected outputs are unknown and the system needs to learn by itself. RL algorithms operate like unsupervised ones, with the addition of a reward mechanism in charge of rewarding or penalizing the system according to the quality (good or bad) of its decisions. This reward mechanism allows the RL system to update itself continuously.
Several techniques have been proposed in the scientific literature to provide machine learning to LTE/LTE-A networks. Some examples are Markov models (Galindo-Serrano & Giupponi, 2013), heuristics (Morozs, Clarke, & Grace, 2015), fuzzy logic (Razavi et al., 2010), and genetic algorithms (Gao, Chen, & Li, 2014), among others (see Figure 6). However, the evolution of these techniques implies greater complexity in the data processing required for the algorithms to operate correctly, an area currently under research in the Big Data field (Barranco, 2012), which encompasses ML.
In this type of learning, the algorithms receive a set of tagged input and output data, from which a model or function can be generated from the input/output relation. This allows —through a data collection— feeding the model and training the algorithms to improve their predictions. In the LTE context, supervised learning is a wide domain with several algorithms, each with its own specifications and applications. Among the application areas, it is possible to highlight load balancing with statistical regression (Sierra & Marca, 2015), power control (Supratim & Monogioudis, 2015), and dynamic frequency allocation (Li, Peng, Yan, Zhao, & Li, 2016). Among the most common algorithms are the Bayes theorem and linear regression, which are described in the following sections.
The Bayes theorem, as presented by Klaine et al. (2017), is an important rule in probability and statistics for the analysis of conditional probabilities, i.e., for understanding how the probability of a hypothesis (h) is modified by new evidence (e). The Bayes theorem is presented in Equation (1).

P(h|e) = P(e|h) P(h) / P(e)          (1)
where:
P(h|e) is the probability of the hypothesis given the evidence (a posteriori);
P(e|h) is the probability of the evidence given the hypothesis (conditional probability);
P(h) is the probability of the hypothesis (a priori); and
P(e) is the probability of the evidence (total probability).
The Bayes theorem is widely used in the telecommunications area, particularly in the ICI problem.
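As an illustration of Equation (1) in the ICI setting, the following sketch computes the posterior probability of a hypothetical hypothesis ("this RB is reused by a neighbor eNB") given hypothetical evidence ("a low SINR was measured"); the prior and likelihood values are assumptions chosen only for the example:

```python
# Worked example of Equation (1) with hypothetical numbers: h = "this RB is reused
# by a neighbour eNB", e = "a low SINR was measured on the RB".
p_h = 0.3              # prior P(h): a priori probability the RB is reused next door
p_e_given_h = 0.8      # P(e|h): low SINR is likely when the RB is reused
p_e_given_not_h = 0.1  # low SINR is unlikely otherwise

# Total probability of the evidence
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# Bayes theorem: P(h|e) = P(e|h) P(h) / P(e)
p_h_given_e = p_e_given_h * p_h / p_e
print(f"P(h|e) = {p_h_given_e:.3f}")   # ~0.774
```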
Li et al. (2016) study Dynamic Frequency Resource Allocation [DFRA] through the deployment of small cells, called Small Cell Base Stations [SCBS], following a Poisson model. They achieve an adaptive scheme by applying Bayesian theory to the correlated spectral use of adjacent SCBSs. For this, they model the SCBSs and the UE as homogeneous Poisson processes with intensities λBS and λUE, respectively. Each UE is associated with the closest SCBS and expects a number N0 of subcarriers, where N0 is a pre-established value.
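A simplified sketch of such a model (homogeneous Poisson draws for SCBSs and UE, nearest-SCBS association, and a fixed demand of N0 subcarriers per UE) is shown below; the area, intensities, and N0 are assumed values, not those of Li et al.:

```python
import numpy as np

# Sketch of the Poisson deployment summarised above: SCBS and UE drawn as homogeneous
# Poisson point processes with intensities lambda_bs and lambda_ue, each UE associated
# with its closest SCBS. Area, intensities, and N0 are hypothetical.
rng = np.random.default_rng(0)
area, lambda_bs, lambda_ue = 1.0, 20, 200        # 1 km^2; points per km^2

n_bs = rng.poisson(lambda_bs * area)
n_ue = rng.poisson(lambda_ue * area)
scbs = rng.uniform(0, 1, size=(n_bs, 2))         # SCBS positions
ues = rng.uniform(0, 1, size=(n_ue, 2))          # UE positions

# Nearest-SCBS association
dists = np.linalg.norm(ues[:, None, :] - scbs[None, :, :], axis=2)
serving = dists.argmin(axis=1)

# Each associated UE requests a pre-established number N0 of subcarriers
N0 = 12
load_per_scbs = np.bincount(serving, minlength=n_bs) * N0
print("subcarriers demanded per SCBS:", load_per_scbs)
```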
In simple terms, Madariaga, Rodríguez, Lozano, and Vallejo (2013) define linear regression as a mathematical model expressing the relation between a dependent variable and one or many independent variables by using a straight line (the regression line). Equation (2) presents its general form.

y = a + bx          (2)
where:
a represents the intercept, since its value is the point where the regression line crosses the vertical axis; and
b represents the slope of the line.
In the ICI field, it is possible to build a model or function (generally a straight line) through this technique from a set A(x,y) ∈ R^(n+1), whose elements are the parameters to use. The model assumes a linear relation between the variables x and y, and fits a straight line to the data points (see Figure 7). This relation is expressed through a hypothesis function (Equation 3) used to predict a set of outputs (Shanthamallu, Spanias, Tepedelenlioglu, & Stanley, 2017).

h(x) = w1·x1 + w2·x2 + ... + wn·xn          (3)
where:
x1, x2, ..., xn are the parameters; and
w1, w2, ..., wn are the model values (weights).
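The following sketch fits this kind of hypothesis by ordinary least squares on synthetic data, with a single hypothetical feature (number of active UE) predicting a hypothetical KPI (cell throughput):

```python
import numpy as np

# Minimal sketch of fitting a linear hypothesis by least squares.
# A single hypothetical feature x (number of active UE) is used to predict a KPI y
# (cell throughput); the data are synthetic.
rng = np.random.default_rng(1)
x = rng.uniform(5, 50, size=100)                    # number of active UE
y = 40.0 - 0.5 * x + rng.normal(0, 2, size=100)     # noisy linear KPI

X = np.column_stack([np.ones_like(x), x])           # [1, x] -> intercept a and slope b
a, b = np.linalg.lstsq(X, y, rcond=None)[0]
print(f"fitted regression line: y = {a:.2f} + {b:.2f} x")
```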
Sierra and Marca (2015) present a statistical approach based on multivariable polynomial regression consisting of two phases. The first one seeks to find the relation between the explanatory variables (cell traffic state and Cell Range Expansion [CRE]) and the dependent variable (the Packet Loss Ratio [PLR] at the Packet Data Convergence Protocol [PDCP] layer) through a model ruled by function (4):

where:
x is the number of active UE; and
y is the femtocells' compensation (CRE offset).
The second phase consists of the selection of the CRE displacement: the femtocell offset is dynamically determined by following the model extracted in the previous phase. Hence, the packet loss in the PDCP layer is minimized according to the traffic condition of the sector. Finally, a system-level simulation is performed, showing the efficiency of the dynamic CRE.
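A hedged sketch of the first phase, a two-variable polynomial regression relating the number of active UE (x) and the femtocell CRE offset (y) to a packet-loss ratio, is given below; the polynomial degree and the synthetic data are assumptions, since function (4) itself is not reproduced here:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Hedged sketch of a multivariable polynomial regression in the spirit described above:
# relate x (active UE) and y (femtocell CRE offset) to a packet-loss indicator.
# Degree and synthetic data are assumptions; the exact function (4) is not shown here.
rng = np.random.default_rng(2)
x = rng.integers(1, 60, size=200)            # number of active UE
y = rng.uniform(0, 12, size=200)             # CRE offset in dB
plr = 0.02 * x - 0.01 * x * y + 0.003 * y**2 + rng.normal(0, 0.1, size=200)

features = PolynomialFeatures(degree=2).fit_transform(np.column_stack([x, y]))
model = LinearRegression().fit(features, plr)
print("R^2 on training data:", model.score(features, plr))
```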
Bojović, Meshkova, Baldo, Riihijärvi, and Petrova (2016) investigate the use of several ML techniques combined with statistical regression and present a learning-based approach to achieve self-optimization in a SON deployment. In their proposal, the learning capabilities are focused on estimating KPIs at the user and network levels to select the optimal configuration affecting the whole LTE stack. This is done via the Dynamic Frequency and Bandwidth Allocation [DFBA] technique in LTE microcells. They compare their results with other ML approaches; their proposal is a centralized frequency allocation approach for microcells.
Likewise, the authors take as input data the performance measurements under different frequency configurations, which allows estimating the impact of each configuration and predicting the performance. The results show that learning-based DFBA achieves —on average— a performance improvement of 33% relative to approaches based on analytical models, reaching up to 95% of the optimal performance.
This concept arose as an attempt to simulate the behavior of the human brain in computers (Hinton, Srivastava, & Swersky, 2014). The brain is capable of continuously performing highly complex, non-linear calculations in parallel. By dividing these functions into very basic components, known as neurons, and giving them the same calculation function, a simple algorithm can become a robust computing tool.
Li, Liang, and Ascheid (2016) present a proposal for eICIC in HetNets involving the relation between CRE and ABS for the dynamic allocation of RBs. They consider a multi-user HetNet scenario composed of a set of macrocells M = {1, 2, 3, ..., M}, a set of picocells P = {1, 2, 3, ..., P}, and a vector of moving users U = {1, 2, 3, ..., U}. In order to improve the offloading capabilities of a small cell, the CRE strategy is employed by adding a positive bias to the Reference Signal Received Power [RSRP] in the downlink [DL] according to the weight of each element of the users' vector. This information —together with the users' historical data, the channel information, the optimal CRE and ABS patterns, and the SINR vectors— constitutes the inputs required to train the Artificial Neural Network [ANN] machine learning algorithm and obtain, as output, a resource allocation pattern for macro and pico users.
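The sketch below illustrates the general idea with a small feed-forward network on synthetic data: per-user features (SINR, CRE-biased RSRP, historical load) are mapped to a binary scheduling decision (ABS or normal subframe). The features, the labelling rule, and the network size are illustrative assumptions, not the authors' setup:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Hedged ANN sketch: map hypothetical per-user features (SINR, CRE bias applied to RSRP,
# historical macro load) to a resource-allocation decision (1 = serve in an ABS subframe).
# Feature choice, labels, and data are illustrative assumptions, not the authors' dataset.
rng = np.random.default_rng(3)
n = 500
sinr = rng.normal(10, 5, n)              # dB
rsrp_bias = rng.uniform(0, 9, n)         # CRE offset applied to RSRP, dB
load = rng.uniform(0, 1, n)              # historical macro load
X = np.column_stack([sinr, rsrp_bias, load])
# Toy labelling rule: poor-SINR, strongly biased users are scheduled in ABS subframes
labels = ((sinr < 8) & (rsrp_bias > 4)).astype(int)

ann = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=2000, random_state=0)
ann.fit(X, labels)
print("training accuracy:", ann.score(X, labels))
```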
The SVM technique uses a subset of the training data called support vectors, to which a mapping —linear or non-linear, polynomial or Gaussian— is applied, and whose objective is to maximize the distance (margin) between classes; that is, it acts as a classifier. The support vectors are the samples closest to the decision boundary; in the case shown in Figure 8, they are the green circles, the most difficult points to classify, whilst the shaded region indicates the optimal decision boundary obtained (Klaine et al., 2017).
Cordina and Debono (2017) study techniques combined with SVM, through link adaptation and the so-called Frequency Selective Scheduling [FSS], to increase the throughput of mobile services. Through a signaling process, the Channel Quality Indicator [CQI] information delivered to the eNodeB is used; these data are fed back by the UE, and since the signaling load increases due to this feedback, a machine learning technique is exploited to address the issue. The authors propose a new sub-band CQI feedback compression scheme that predicts the channel state through the use of SVM. This information is transmitted to the eNB to recover the full channel sub-band response within a configurable error margin.
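In the same spirit, the sketch below uses support vector regression to predict the CQI of an unreported sub-band from a few reported ones, so that a full sub-band response could be reconstructed from compressed feedback; the feature layout and the synthetic data are assumptions, not the authors' scheme:

```python
import numpy as np
from sklearn.svm import SVR

# Hedged sketch of SVM-based channel prediction: regress the CQI of an unreported
# sub-band from a few reported neighbouring sub-band CQIs. Layout and data are assumptions.
rng = np.random.default_rng(4)
reported = rng.integers(1, 16, size=(300, 4)).astype(float)   # 4 reported sub-band CQIs
target = reported.mean(axis=1) + rng.normal(0, 0.5, 300)      # CQI of an unreported sub-band

svr = SVR(kernel="rbf", C=10.0, epsilon=0.2)
svr.fit(reported, target)
print("predicted CQI for one UE:", svr.predict(reported[:1])[0])
```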
In unsupervised learning, there are no tagged data for training; the algorithm receives a set of untagged inputs, i.e., the outputs are unknown. Its objective is to search for groupings based on similar features, in order to find some structure or way to organize the data (Klaine et al., 2017). Some examples of unsupervised approaches applicable to ICIC are game theory (Miramá & Quintero, 2016), load balancing (Gao et al., 2014), cell outage management (Moysen et al., 2014), and genetic algorithms (Trabelsi, Chen, Azouzi, Roullet, & Altman, 2017). In the following sections, we describe some of the algorithms most commonly used for ICIC.
This unsupervised learning algorithm is one of the most used in cellular systems (Klaine et al., 2017). Its function is to group untagged data and find their centers from two parameters: the initial dataset and the desired number of clusters (see Figure 9).
Some of its applications can be found in the work of Hajjar, Aldabbagh, and Dimitriou (2015) where, through the grouping and classification of low-power nodes, a Cluster Head [CH], or main node of the group, is selected. This CH is in charge of the direct communication with the eNBs; a secondary —or slave— group is also selected to handle the retransmissions. For the classification, the SINR parameter is employed, depending on the service type and the number of associated slaves. Furthermore, a total of 100 RBs is assigned to the whole cell, distributed one-to-one among the UE. For the blocked UE, an appropriate CH among the connected ones is assigned to retransmit their information to the eNB. In this proposal, the K-means algorithm is combined with Hierarchical Agglomerative Clustering [HAC].
Within the HetNet topic, Qi, Zhang, Chen, and Zhang (2018) seek self-organization for macrocells and load balancing for different transmission powers. This is done via a User-Based K-means Clustering Algorithm [UBKCA], which first classifies the UE groups in the center to reduce the computational complexity and then uses the CRE and the edge user factor to improve the load balance. The optimal combination of these two elements yields gains both in accuracy and in the UE offloading factor. Using the decision boundaries calculated with the algorithm, a closed-loop SON system is implemented. The model defines a dataset R = {R1, R2, R3, ..., Rn} divided into k groups. The centers of each group, forming a set O = {O1, O2, ..., Ok}, are chosen randomly; the n − k remaining elements are then assigned to their closest center to form clusters, and the new center of each group is calculated. This task is repeated until the system converges.
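A minimal K-means sketch following the loop just described (random initial centers, assignment to the closest center, center recomputation, repetition until convergence), applied to synthetic UE positions:

```python
import numpy as np

# Minimal K-means sketch of the iterative loop described above. Data are synthetic UE positions.
def k_means(R, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centres = R[rng.choice(len(R), size=k, replace=False)]     # set O of initial centres
    for _ in range(iters):
        # assign each point to its closest centre
        labels = np.linalg.norm(R[:, None] - centres[None], axis=2).argmin(axis=1)
        # recompute centres (keep the old one if a cluster happens to be empty)
        new_centres = np.array([R[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
                                for j in range(k)])
        if np.allclose(new_centres, centres):                  # convergence reached
            break
        centres = new_centres
    return centres, labels

ue_positions = np.random.default_rng(5).uniform(0, 1000, size=(200, 2))
centres, labels = k_means(ue_positions, k=3)
print("cluster centres:\n", centres)
```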
This is a mathematical tool based on the hypotheses established by von Neumann, Morgenstern, and Nash; the latter proposed the Nash Equilibrium [NE] concept. The modelling is similar to the RL algorithms described in the following section, where multiple agents take actions in a competitive environment and, at the end of the game and depending on the result obtained by each agent, a reward or a penalization is granted (Miramá & Quintero, 2016).
These algorithms can be applied to ICIC, as Morel and Randriamasy (2017) describe. Their work is focused on improving the Quality of Experience [QoE], a metric different from the usual Quality of Service [QoS]: the QoE subjectively assesses the degree of satisfaction perceived by a user, in this case within a HetNet scenario with growing HD video demand; that is, it is focused on a specific utility and not on the general network performance. The approach seeks to optimize the silence (muting) periods of macro and microcells between eNBs and connected users through a centralized optimizer and coordinator. Besides, it seeks a direct impact on the spatial distribution and on the SINR by using the ABS and Cell Individual Offset [CIO] parameters to select the optimal pairs and limit the interference caused by the edge UE while the QoE is improved.
This learning method is similar to the previous one, with the main difference that it is focused on the system agents, which analyze their current state and their environment and, according to this analysis, choose the most adequate action (Miramá & Quintero, 2016). RL differs from other algorithms in the process performed after the action is chosen (see Figure 10), which comprises four elements (Klaine et al., 2017):
• Policies, which associate states with the actions to be taken by the agent;
• Reward function, assessing the current state and granting a reward or penalization according to the results of the previously performed action;
• Value function, estimating the future reward when the agent applies an action in the current state; and
• Environment model, determining the states and the possible actions the agent can take.
The proposal by Moysen et al. (2014) applies an RL approach, where Cell Outage Compensation [COC] is achieved in a self-organized way. This is done by using a learning scheme capable of making online decisions in each eNB. To implement this approach, concepts such as Temporal Difference [TD] and Actor-Critic [AC] learning are merged, plus a modified FFR scheme.
These tools provide continuous interaction with the cellular scenario and allow the system to learn from experience. Consequently, adaptation to changes in terms of user mobility, coverage, and interference —among others— is achieved. The authors assume that the network has been able to detect the service degradation and identify the outage, and they apply RL to the automatic repair in order to improve the response time. This type of approach requires low complexity and achieves self-organization in an efficient and fast way.
Razavi et al. (2010) combine fuzzy logic with reinforcement learning by defining an efficient learning process via an appropriate distribution of the reinforcement signal (selection of multiple simultaneous states combined with fuzzy logic). This entails an automatic optimization of the coverage through the adjustment of the downtilt angle, achieving self-optimization. This work stands out, among other reasons, because the reinforcement learning states are fed with real input parameters, whereas other similar solutions model the scenario as a finite-state Markov decision process (Eberle, 2015), which makes the system static and unable to consider environment changes. Although it is an effective solution, it is fully distributed among the LTE base stations, unlike current centralized proposals such as those of Bojović et al. (2016) and Supratim and Monogioudis (2015). Nevertheless, the scheme presents better performance than a similar proposal that learns fuzzy logic rules, and optimal parameters are obtained on a global scale for the given scenario.
The Q-Learning [QL] approach is a type of RL algorithm where an agent builds a table of q-values. By exploring the environment, the q function acts as a value function that estimates the future rewards of current actions. The q-values are stored in a table for each possible state and action. Nevertheless, the table can become highly complex and inapplicable for large state-action spaces (Daeinabi & Sandrasegaran, 2014).
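A minimal tabular Q-learning sketch is shown below; the states, actions, and reward function are hypothetical stand-ins (e.g., the state as an interference level and the action as the sub-band to restrict) and are not taken from any specific proposal:

```python
import numpy as np

# Minimal tabular Q-learning sketch. States, actions, and reward are hypothetical
# stand-ins (state = interference level, action = which sub-band to restrict).
n_states, n_actions = 4, 3
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.2
rng = np.random.default_rng(6)

def step(state, action):
    # Toy environment: restricting the "right" sub-band for the state yields reward 1
    reward = 1.0 if action == state % n_actions else -0.1
    next_state = rng.integers(n_states)
    return reward, next_state

state = 0
for _ in range(5000):
    action = rng.integers(n_actions) if rng.random() < epsilon else int(Q[state].argmax())
    reward, next_state = step(state, action)
    # Q-learning update: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state

print(np.round(Q, 2))
```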
QL is a very popular algorithm within ICIC applications. Ming, Ye, and Xinyu (2017) propose a multi-agent QL model for resource allocation, where two paradigms are introduced inside a HetNet environment: a distributed QL algorithm applicable to picocells, which learn from the environment whether they should share information; and a centralized QL algorithm, where different agents interact with each other by sharing resources and scheduling information. Unlike the traditional RL approaches, QL algorithms do not require a precise model; hence, the latter are chosen for the resource allocation. The proposed model has M macrocells, which form a layer of N clusters, together with K picocells randomly distributed inside each cluster. Likewise, Um and Up are defined as the users of the macro and picocells, respectively. Each user selects the cell to use according to the RSRP parameter, and users are randomly located inside the coverage areas. For this, the small cells inside the same cluster must share partial information, which leads to better performance results.
On the other hand, QL can be combined with other techniques such as fuzzy logic, as Daeinabi and Sandrasegaran (2014) propose. They analyze the CRE producing ICI in the downlink inside a HetNet and, as a solution, propose a dynamic ABS allocation system using a QL approach combined with fuzzy logic. This makes it possible to tackle the issue of a complex table with continuous q-values by transforming it into a table with discrete values combining numerical and linguistic data, the latter obtained through the operator's experience. A fuzzy QL controller is included to process the fuzzy state variables, such as the number of macro/pico UE, the number of ABS frames, the throughput of macro UE, and the SINR. The objective of the controller is to find the ∆ABS value that is added to the current ABS value of each macrocell. Each eNB performs local tasks to find this value and then uses the optimal ABS value for itself and for its own picocells.
Morozs et al. (2015) start from the Dynamic Spectrum Access [DSA] concept in LTE networks to propose an improved ICIC alternative through a distributed RL algorithm called Distributed ICIC Accelerated Q-learning [DIAQ]. This proposal is also Heuristically Accelerated [HARL] to mitigate the disadvantage of Q-learning algorithms of requiring many learning iterations. Such acceleration is provided particularly in the multi-agent domain, guiding the exploration process by using additional heuristic information. The scheme is assessed in a simulated stadium-like scenario. The work also describes a generic hexagonal architecture and the usage of the X2 interface for the exchange of ICIC signals, where a central eNB sends ICIC signals to the neighbor eNBs. Additionally, a theoretical analysis based on a Bayesian network was performed, pursuing a new DSA approach using RL. Consequently, this work presents an interesting combination of learning and heuristic techniques, achieving significant performance by accelerating the algorithm. This yields important results in terms of reduced response time, in line with the pursued objective of reducing the computational complexity.
This type of algorithm corresponds to simple schemes governed by a set of rules that seek to take the best decision for the system at a given moment. Because they fit as unsupervised algorithms, they are applied when there is no known solution to a particular problem; however, due to their simplicity, the solutions tend to be approximate and suboptimal. Likewise, meta-heuristics follow a similar approach defined by a set of rules, but they are more complex and operate at a higher level, producing better solutions (Klaine et al., 2017).
In the ICIC context, it is possible to combine heuristic techniques with other models, as Supratim and Monogioudis (2015) present. Their proposal, called Learning-based Adaptive Power control [LeAP], is a machine learning approach based on data measurements (power control parameters) to manage the interference in the uplink. The proposal is composed of two key elements: the design of measurement statistics for the UE (basic but concise data, enough to determine interference patterns in the network environment), and the design of two learning-based algorithms, one seeking the optimal power control parameters and the other a fast heuristic that can be implemented using commercially available solvers. The evaluation of the benefits of this approach is performed using a complex urban scenario around a train station in the United States, where an improvement in the data rate at the cell edges is demonstrated. Compared with similar approaches, this solution can be implemented using a centralized SON server architecture.
Genetic Algorithms [GA] are heuristic algorithms inspired by concepts from nature, since they evolve a family of solutions from which the best one is identified after a certain number of generations. Even though they are simple schemes, they are applied to complex problems (Jain, 2017).
Kashaf et al. (2017) propose an approach based on GA to reduce the interference in LTE networks and, at the same time, improve the network performance. Their solution operates in an environment with variable traffic, user positions, and propagation conditions; these variations are modelled with non-linear and probability functions. GA is a machine learning technique that models the biological process of evolution through the concept of generations, which in this case represent the iterations of the scenario used to improve the network KPIs. Several generations are simulated to achieve self-optimization with the most adequate results. The optimized KPIs are denoted as acceptance rate and represent the file transfer time and the average bit rate. For the system modelling, the frequency band is divided into three sub-bands: one for the edge and two for the central region. The users with the worst channel quality are assigned to the protected edge band; although they are mainly at the cell edges, they can also be in the center and experience fading conditions. If the edge band of a cell is completely occupied, the remaining users are allocated PRBs in the central bands. The simulation results show improvements in the average call success ratio and in the throughput.
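A hedged GA sketch in this spirit is shown below: each chromosome assigns one of three sub-bands to every user, and the fitness is a stand-in KPI that rewards giving the protected edge band to users with poor channel quality; the encoding, the fitness, and the parameters are illustrative assumptions:

```python
import numpy as np

# Hedged GA sketch: chromosomes assign each user one of three sub-bands
# (0 = protected edge band, 1-2 = centre bands); the fitness is a stand-in KPI.
rng = np.random.default_rng(7)
n_users, pop_size, generations = 30, 40, 60
channel_quality = rng.uniform(0, 1, n_users)        # low value = poor channel

def fitness(chromosome):
    protected = chromosome == 0
    # reward protecting poor-channel users, penalise overfilling the edge band
    return np.sum((1 - channel_quality) * protected) - 0.2 * max(0, protected.sum() - 10)

population = rng.integers(0, 3, size=(pop_size, n_users))
for _ in range(generations):
    scores = np.array([fitness(c) for c in population])
    parents = population[np.argsort(scores)[-pop_size // 2:]]        # selection
    cut = rng.integers(1, n_users, size=pop_size)                    # crossover points
    children = np.array([np.concatenate([parents[rng.integers(len(parents))][:c],
                                         parents[rng.integers(len(parents))][c:]])
                         for c in cut])
    mutate = rng.random(children.shape) < 0.02                       # mutation
    children[mutate] = rng.integers(0, 3, size=mutate.sum())
    population = children

best = max(population, key=fitness)
print("best KPI found:", round(float(fitness(best)), 3))
```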
Gao et al. (2014) propose Intelligent Fractional Frequency Reuse [I-FFR] for an LTE network with femtocells by considering two regions: the Cell Center Region [CCR] and the Cell Edge Region [CER]. The latter is divided into sectors, and a combination of genetic and graph recognition algorithms is applied, because in conventional FFR schemes the CCR ratio and the reuse factor in the CER are fixed and the RBs are distributed uniformly to each sector of the CER. Consequently, such schemes do not offer a resource allocation strategy for dense, dynamically changing femtocell deployments. Hence, the main contribution of this work consists of adaptively adjusting the CCR proportion and the reuse factor in the CER, achieving a proportional allocation of resource blocks for each sector. A system model with three aspects is defined: the region division, which defines the coverage ratios of the macro and femtocells; the users' distribution, where two sets are assumed: Macro User Equipment [MUE] and Femto User Equipment [FUE]; and the resource allocation, where a total set of RBs is assumed. Furthermore, this work defines a scenario with a density of 100 users evenly distributed in the macrocell and a single user per femtocell, which entails a total of 100 femtocells, and an intelligent FFR scheme is involved.
Inter-cell interference coordination is a problem of growing interest in LTE/LTE-A networks due to the current complexity of these networks, caused by the high user density, the variable traffic load, and the deployment of heterogeneous networks. Although there are many techniques to address the ICI issue, including frequency reuse, these are static techniques that do not adapt to the high traffic variations of current networks.
Machine learning techniques are schemes that allow the ICI to be addressed from the point of view of self-organization, i.e., they give cellular networks the capacity to adjust their network parameters and adapt to environment variations. In order to achieve efficient algorithms, careful data processing is necessary; in the mobile network area, these data are represented by the network parameters (inputs) and the assessment metrics or KPIs (outputs).
To tackle the ICI issue through ML, it is advisable to follow a flow in which, first, the input parameters to use are defined —e.g., the amount and distribution of UEs, the power and distribution of macro and micro cells, etc.—. Secondly, the expected results are defined through the selection of KPIs —e.g., throughput and spectral efficiency—, to finally generate the model in charge of training the algorithm; in this last step, the ML technique is selected and the mathematical model is defined. Although there are many ML techniques, the selection of the most adequate one depends on the type of available data, the environment of the problem, and the expected results for each particular case.
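A minimal sketch of such a flow could be the following; the synthetic network parameters used as inputs, the throughput-like KPI used as output, and the choice of a generic regressor as the trained model are all assumptions made for illustration:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor

# Hedged sketch of the suggested flow: (1) define input network parameters,
# (2) define the target KPI, (3) train and validate a model. Features, KPI,
# synthetic data, and the regressor choice are illustrative assumptions.
rng = np.random.default_rng(8)
n = 400
X = np.column_stack([
    rng.integers(5, 80, n),        # number of UE in the cell
    rng.uniform(30, 46, n),        # eNB transmit power (dBm)
    rng.uniform(0, 1, n),          # fraction of edge users
])
throughput = 100 - 0.6 * X[:, 0] + 1.5 * (X[:, 1] - 30) - 20 * X[:, 2] + rng.normal(0, 3, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, throughput, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("held-out R^2:", round(model.score(X_te, y_te), 3))
```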
How to cite: Trejo, O. & Miramá, V. (2018). Machine learning algorithms for inter-cell interference coordination. Sistemas & Telemática, 16(46), 37-57. doi:10.18046/syt.v16i46.3034






