Cognitive Radio Networks: Highlights of Information Theoretic Limits, Models and Design

In recent years, the development of intelligent, adaptive wireless devices called cognitive radios, together with the introduction of secondary spectrum licensing, has led to a new paradigm in communications: cognitive networks. Cognitive networks are wireless networks that consist of several types of users: often a primary user (the primary license-holder of a spectrum band) and secondary users (cognitive radios). These cognitive users employ their cognitive abilities to communicate without harming the primary users. The study of cognitive networks is relatively new and many questions are yet to be answered. In this article we highlight some of the recent information theoretic limits, models, and design of these promising networks.

Spectrum license holders obtain from the FCC the exclusive right to transmit over their spectral bands. Since most of the bands have been licensed, and the unlicensed bands are also rapidly filling up, it would appear that a spectral crisis is approaching. This, however, is far from the case. Recent measurements have shown that for as much as 90% of the time, large portions of the licensed bands remain unused. As licensed bands are difficult to reclaim and release, the FCC is considering dynamic and secondary spectrum licensing as an alternative to reduce the amount of unused spectrum. Bands licensed to primary users could, under certain negotiable conditions, be shared with nonprimary users without having the primary licensee release its own license. Whether the primary users would be willing to share their spectrum would depend on a number of factors, including the impact on their own communication.
The application of cognitive networks, however, is not limited to just fixing the current spectrum licensing. Other applications abound in shared spectra, such as the ISM band (where different devices need to coexist without inhibiting each other), sensor networks (where the sensors may need to operate in a spectrum with higher power devices), and current services such as the cellular network (where the operator may want to offer different levels of services to different types of users). All of these possibilities motivate the study of cognitive networks.
Cognitive radios, wireless devices with reconfigurable hardware and software (including transmission parameters and protocols), are capable of delivering what these secondary devices would need: the ability to intelligently sense and adapt to their spectral environment. By carefully sensing the primary users' presence and adapting their own transmission to guarantee a certain performance quality for the primary users, these cognitive devices could dramatically improve spectral efficiency. Along with this newfound flexibility comes the challenge of understanding the limits of these cognitive capabilities and of designing protocols and transmission schemes that fully exploit them. In order to design practical and efficient protocols, the theoretical limits must be well understood.

INTRODUCTION AND PRELIMINARIES
In this article, we outline some recent results on the fundamental information theoretic and communication theoretic limits of cognitive networks. The general cognitive network exploits cognition at a subset of its nodes (users). Cognition may take various forms of learning and adapting to the environment. We focus on cognition in the form of nodes having extra information, or side information, about the wireless environment in which they transmit. For example, they may sense the presence of the primary nodes by listening to their beacons or may be able to decode certain overheard primary messages. We then discuss two questions: a) How can the nodes most efficiently exploit the available side information? and b) How do the cognitive users affect the primary users in terms of interference, and how should one design network parameters to ensure efficient secondary communication while guaranteeing primary performance?

SMALL NETWORKS: ACHIEVABLE RATE AND CAPACITY REGIONS
One of information theory's main contributions is the characterization of fundamental limits of communication. A communication channel is modeled as a set of conditional probability density functions relating the inputs and outputs of the channel. Given this probabilistic characterization of the channel, the fundamental limits of communication may be expressed in terms of a number of metrics, of which capacity is one of the best known and most powerful. Capacity is defined as the supremum over all rates (expressed in bits/channel use) for which reliable communication may take place. The additive white Gaussian noise (AWGN) channel with quasistatic fading is the example we will discuss the most in this article. In the AWGN channel, the output $Y$ is related to the input $X$ according to $Y = hX + N$, where $h$ is a fading coefficient (often modeled as a Gaussian random variable) and $N \sim \mathcal{N}(0, 1)$ is the noise. Under an average input power constraint $E[|X|^2] \le P$, the well-known capacity, when $h$ is a fixed and known constant, is given by $C = C(\mathrm{SINR}) = \frac{1}{2}\log_2(1 + |h|^2 P)$, where SINR is the received signal to interference plus noise ratio, and $C(x) := \frac{1}{2}\log_2(1 + x)$. We will assume the channel gain(s) $h$ is fixed and known to all relevant transmitters and receivers for the rest of this section. On the other hand, when the channel undergoes fast fading, we speak of the ergodic channel capacity given by $C = \frac{1}{2} E_h[\log_2(1 + |h|^2 P)]$. If the channel undergoes slow fading, the outage capacity would be the appropriate metric to consider.
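As a rough numerical illustration of these definitions, the sketch below evaluates the capacity function C(x), the fixed-gain AWGN capacity, and a Monte Carlo estimate of the ergodic capacity. The function names, the choice of a real, unit-variance Gaussian h, and the sample count are assumptions made for this example only.

```python
import math
import random


def C(x: float) -> float:
    """Capacity function C(x) = (1/2) log2(1 + x), in bits per channel use."""
    return 0.5 * math.log2(1.0 + x)


def awgn_capacity(h: float, P: float) -> float:
    """Capacity for a fixed, known gain h, unit noise variance and power P."""
    return C(abs(h) ** 2 * P)


def ergodic_capacity(P: float, trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of (1/2) E_h[log2(1 + |h|^2 P)].

    Assumption for this sketch: h is drawn as a real N(0, 1) variable.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        h = rng.gauss(0.0, 1.0)
        total += C(h * h * P)
    return total / trials
```

With h = 1 and P = 3, `awgn_capacity` returns 1 bit per channel use. Under the same average gain, the ergodic estimate comes out lower, as Jensen's inequality predicts for a concave rate function.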
While capacity is central to many information theoretic studies, it is often challenging to determine. Inner bounds, or achievable rates, as well as outer bounds to the capacity may be more readily available. This is particularly true in channels in which multiple transmitters and multiple receivers wish to communicate simultaneously. Indeed, multiuser information theory or network information theory is a challenging field with a plethora of open questions. As an example, one of the central, and simplest of multiuser channels is the information theoretic interference channel. This channel consists of two independent transmitters that wish to communicate independent messages to two independent receivers. Although the channel capacity region is known in certain cases, the general capacity region, despite promising recent advances [1], [2], remains a mystery. Capacity regions and achievable rate regions are natural extensions of the notions of capacity and achievable rate to higher dimensions. At the crux of this lies the information theory community's lack of understanding of how to deal with interference and overheard, undesired information.

In the remainder of this section, we consider "small" cognitive networks, or networks with a tractable number of primary and cognitive nodes. In these networks, we are able to find exact achievable rate and capacity regions. We illustrate the impact of different types of cognition on the achievable rates by outlining precise ways in which the cognitive users may exploit the given side information. The basic and natural conclusion is that, the higher the level of cognition at the cognitive terminals, the higher the achievable rates. However, increased cognition often translates to increased complexity. At what level of cognition future secondary spectrum licensing systems will operate will depend on the available side information and network design constraints.
In the next section, we will consider large cognitive networks. For such networks, the different communication possibilities explode, and to date, precise network capacity region results are lacking. For large networks, the asymptotic performance of the network is often more tractable. For example, the interesting question of how the network's throughput scales as the number of users increases to infinity has received significant attention in the past years. We illustrate how network scaling results differ in a cognitive setting from those seen in traditional ad hoc network settings. Specifically, node distributions, the way nodes pair up to communicate and routing protocols, all play a significant role in the resulting throughput scaling.

ACHIEVABLE RATE AND CAPACITY REGIONS
We first look at a simple network in which a single primary transmitter-receiver pair $(P_{Tx}, P_{Rx})$ and a single cognitive transmitter-receiver pair $(S_{Tx}, S_{Rx})$ wish to share the wireless channel, as shown on the left of Figure 1. Since the secondary user is a cognitive radio, the natural question to ask is: how does it exploit its own flexibility to improve communication rates? The intuitive answer is, of course, that it depends on what the cognitive transmitter knows about its wireless environment, i.e., what additional side information the cognitive transmitter or receiver has to exploit. We will outline four examples of cognition at the secondary nodes, four possible corresponding transmission strategies, and their respective illustrative rate regions, which convey the general conclusion: the greater the side information and cognitive abilities (including computation), the larger the achievable rate regions. For the remainder of this section, we assume that the outputs at the primary and cognitive receivers, $Y_p$ and $Y_c$ respectively, are related to the inputs at the primary and cognitive transmitters, $X_p$ and $X_c$ respectively, as
$$Y_p = X_p + h_{21} X_c + N_p, \qquad Y_c = h_{12} X_p + X_c + N_c, \qquad N_p, N_c \sim \mathcal{N}(0, 1).$$
Here $h_{12}$ and $h_{21}$ are the quasi-static fading coefficients, assumed to be known to all transmitters and receivers. The rates achieved by the primary and cognitive Tx-Rx pairs are $R_1$ and $R_2$ respectively, measured in bits/channel use.

SPECTRAL-GAP FILLING (WHITE-SPACE FILLING) APPROACH
An intuitive first approach to secondary spectrum licensing is a scenario in which cognitive radios sense the spatial, temporal, or spectral voids and adjust their transmission to fill in the sensed white spaces. This approach has also been called the interference-avoiding paradigm [3]. Cognition in this setting indicates the ability to accurately detect the presence of other wireless devices; the cognitive side information is knowledge of the spatial, temporal, and spectral gaps a particular cognitive Tx-Rx pair would experience. Cognitive radios could adjust their transmission to fill in the spectral (or spatial/temporal) void, as illustrated in Figure 2. If properly implemented, this simple and intuitive scheme could drastically improve the spectral efficiency of currently licensed bands. This white-space filling strategy is often considered to be the key motivation for the introduction and development of cognitive radios.
To determine upper bounds on the communication possible, we assume that knowledge of the spectral gaps is perfect: when primary communication is present, the cognitive devices are able to determine it precisely and instantaneously. In a realistic system, the secondary transmitter would spend some of its time sensing the channel to determine the presence of the primary user. For simplicity, we assume this sensing time is zero and the primary is always perfectly detected. While such assumptions may be valid for the purpose of theoretical study, practical methods of detecting primary signals have also been of great recent interest. A theoretical framework for determining the limits of communication as a function of the sensed cognitive transmitter and receiver gaps is formulated in [4]. Because current secondary spectrum licensing proposals demand detection guarantees of primary users at extremely low levels in harsh fading environments, a number of authors have suggested improving detection capabilities by allowing multiple cognitive radios to collaboratively detect the primary transmissions [5], [6]. Assuming ideal detection of the primary user, and assuming the cognitive radio is able to perfectly fill in the spectral gaps, the rates $R_1$ of the primary Tx-Rx pair and $R_2$ of the cognitive Tx-Rx pair achieved through ideal white-space filling are shown as the inner white triangle of Figure 1. The intersection points with the axes are the rates achieved when a single user transmits the entire time in an interference-free environment. The convex hull of these two interference-free points may be achieved by time-sharing in a time division multiple access (TDMA) fashion, where the actual average rate the secondary link may expect depends on the temporal usage statistics of the primary transmitter.
If the primary and secondary power constraints are $P_1$ and $P_2$, respectively, then the white-space filling rate region may be described as
$$\text{White-space filling region (a)} = \{(R_1, R_2) \mid 0 \le R_1 \le \alpha\, C(P_1),\ 0 \le R_2 \le (1-\alpha)\, C(P_2),\ 0 \le \alpha \le 1\},$$
where $\alpha$ is the fraction of time during which the primary user transmits.
On a practical note of interest, the FCC is in the second phase of testing white-space devices from a number of companies and research labs. Thus, white-space filling is fast becoming a practical reality, further advancing the need for new boundary-pushing technologies and transmission schemes.
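The time-sharing argument above can be sketched numerically. The snippet assumes unit direct-link gains and unit noise variance, so that the interference-free rates are C(P1) and C(P2). The helper name `white_space_boundary` and the sweep variable `time_fraction` are hypothetical and introduced only for illustration.

```python
import math


def C(x: float) -> float:
    """Capacity function C(x) = (1/2) log2(1 + x)."""
    return 0.5 * math.log2(1.0 + x)


def white_space_boundary(P1: float, P2: float, steps: int = 4):
    """Boundary points (R1, R2) of the ideal white-space filling region.

    The primary link is active for a fraction time_fraction of the time and
    the cognitive link fills the remaining fraction, with no overlap.
    """
    c1, c2 = C(P1), C(P2)
    points = []
    for k in range(steps + 1):
        time_fraction = k / steps
        points.append((time_fraction * c1, (1.0 - time_fraction) * c2))
    return points
```

For P1 = 3 and P2 = 15, the end points are (0, 2) and (1, 0) bits per channel use, and every intermediate point lies on the straight line joining them, which is the outer edge of the triangle in Figure 1.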

SIMULTANEOUS, CONTROLLED TRANSMISSION (INTERFERENCE TEMPERATURE)
While white-space filling demands that cognitive transmissions be orthogonal (in, for example, space, time, or frequency) to primary transmissions, another intuitive approach to secondary spectrum licensing would involve nonorthogonal transmission. Rather than detecting white spaces, a cognitive radio would simultaneously transmit with the primary device. It would use its cognitive capabilities to determine at what power level it should transmit so as not to harm the primary transmission. While the definition of harm may be formulated mathematically in a number of ways, one common definition involves the notion of interference temperature. Interference temperature corresponds to the average level of interference power seen at a primary receiver. In secondary spectrum licensing scenarios, the primary receiver's interference temperature should be kept at a level that will satisfy the primary user's desired quality of service. That is, primary transmission schemes may be designed to withstand a certain level of interference, which cognitive radios or secondary nodes may exploit for their own transmission. Provided the cognitive user knows 1) the maximal interference temperature for the surrounding primary receivers, 2) the current interference temperature level, and 3) how its own transmit power will translate to received power at the primary receiver, then the cognitive radio may adjust its own transmission power so as to satisfy any interference temperature constraint the primary user(s) may have.
This interference-temperature controlled transmission scheme falls under the interference-controlled paradigm of [3] and has received a great deal of attention from the academic community. The works [7] and [8] consider the capacity of cognitive systems under various receive-power (or interference-temperature-like) constraints.
We consider a simple scenario in which each receiver treats the other user's signal as noise, providing a lower bound to what may be achieved using more sophisticated decoders. The rate region obtained is shown as the light grey region in Figure 1(b). This region is obtained as follows: we assume the primary transmitter communicates using a Gaussian codebook of constant average power $P_1$. We assume the secondary transmitter allows its power to lie in the range $[0, P_2]$, where $P_2$ is some maximal average power constraint. The rate region obtained may be expressed as
$$\text{Simultaneous-transmission rate region (b)} = \left\{(R_1, R_2) \,\middle|\, 0 \le R_1 \le C\!\left(\frac{P_1}{h_{21}^2 P_2^* + 1}\right),\ 0 \le R_2 \le C\!\left(\frac{P_2^*}{h_{12}^2 P_1 + 1}\right),\ 0 \le P_2^* \le P_2\right\}.$$
The actual value of $P_2^*$ chosen by the cognitive radio depends on the interference temperature, or received power constraints, at the primary receiver.
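A minimal sketch of this power-control idea follows. It assumes unit direct-link gains, unit noise variance, a cross gain h21 from the cognitive transmitter to the primary receiver, a cross gain h12 from the primary transmitter to the cognitive receiver, and no interferers other than the cognitive user. The names `cognitive_power`, `treat_as_noise_rates`, and `interference_cap` are hypothetical.

```python
import math


def C(x: float) -> float:
    """Capacity function C(x) = (1/2) log2(1 + x)."""
    return 0.5 * math.log2(1.0 + x)


def cognitive_power(P2_max: float, h21: float, interference_cap: float) -> float:
    """Largest power in [0, P2_max] whose received power at the primary
    receiver, h21^2 * P, stays at or below interference_cap."""
    if h21 == 0.0:
        return P2_max  # the primary receiver sees no cognitive signal
    return min(P2_max, interference_cap / (h21 * h21))


def treat_as_noise_rates(P1: float, P2_star: float, h12: float, h21: float):
    """Rates when each receiver treats the other user's signal as noise."""
    r1 = C(P1 / (1.0 + h21 ** 2 * P2_star))
    r2 = C(P2_star / (1.0 + h12 ** 2 * P1))
    return r1, r2
```

For example, with a power budget of 10, h21 = 0.5, and an interference cap of 1, the cognitive user backs off to a transmit power of 4. Sweeping the cap from zero upward traces out the boundary of the grey region.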
[FIG2] One of the simplest instances of cognition: a cognitive user senses the time/frequency white spaces and opportunistically transmits over these detected spaces.

OPPORTUNISTIC INTERFERENCE CANCELLATION
We now increase the level of cognition even further. We assume the cognitive link has the same knowledge as in the interference-temperature case (b) and has some additional information about the primary link's communication: the primary user's codebook. Primary codebook knowledge translates to being able to decode primary transmissions. We next suggest a scheme that exploits this extra knowledge.
In opportunistic interference cancellation, as first outlined in [9], the cognitive receiver opportunistically decodes the primary user's message, which it then subtracts off its received signal. This intuitively cleans up the channel for the cognitive pair's own transmission. The primary user is assumed to be oblivious to the cognitive user's operation, and so continues transmitting at power $P_1$ and rate $R_1$. When the rate of the primary user is low enough relative to the primary signal power at the cognitive receiver, i.e., $R_1 \le C(h_{12}^2 P_1)$, to be decoded by $S_{Rx}$, the channel $(P_{Tx}, S_{Tx} \to S_{Rx})$ will form an information theoretic multiple-access channel. In this case, the cognitive receiver will first decode the primary's message, subtract it off its received signal, and proceed to decode its own. When the cognitive radio cannot decode the primary's message, the latter is treated as noise.
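The decision rule described above can be sketched as follows. This is a simplified illustration that uses the standard two-user Gaussian multiple-access bounds at the cognitive receiver, with unit direct gain and unit noise variance assumed. It is not claimed to reproduce the exact scheme of [9], and the function name `cognitive_rate` is hypothetical.

```python
import math


def C(x: float) -> float:
    """Capacity function C(x) = (1/2) log2(1 + x)."""
    return 0.5 * math.log2(1.0 + x)


def cognitive_rate(R1: float, P1: float, P2: float, h12: float) -> float:
    """Largest cognitive rate R2 for a fixed primary rate R1 (sketch).

    If R1 <= C(h12^2 P1), the cognitive receiver decodes the primary
    message, so R2 is limited by the multiple-access bounds
    R2 <= C(P2) and R1 + R2 <= C(h12^2 P1 + P2).
    Otherwise the primary signal is treated as noise.
    """
    s = h12 * h12 * P1  # primary signal power seen at the cognitive receiver
    if R1 <= C(s):
        return min(C(P2), C(s + P2) - R1)
    return C(P2 / (1.0 + s))
```

At the threshold R1 = C(h12^2 P1), the multiple-access bound and the treat-as-noise rate coincide, so under these assumptions the achievable R2 does not jump as the primary rate crosses the decodability limit.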
Region (c) of Figure 1 illustrates the gains opportunistic decoding may provide over the former two strategies. It is becoming apparent that higher rates are achievable when there is a higher level of cognition in the network which is properly exploited. What type of cognition is valid to assume will naturally depend on the system/application.

COGNITIVE TRANSMISSION (USING ASYMMETRIC TRANSMITTER SIDE INFORMATION)
Thus far, the side information available to the cognitive radios has been knowledge of the primary spectral gaps, knowledge of the primary interference constraints and channel gain $h_{21}$, and primary codebooks. We increase the cognition even further and assume the cognitive radio has the primary codebooks as well as the message to be transmitted by the primary sender. This would allow for a form of asymmetric cooperation between the primary and cognitive transmitters. This asymmetric form of transmitter cooperation, first introduced in [10], can be motivated in a cognitive setting in a number of ways. For example, if $S_{Tx}$ is geographically close to $P_{Tx}$ (relative to $P_{Rx}$), then the wireless channel $(P_{Tx} \to S_{Tx})$ could be of much higher capacity than the channel $(P_{Tx} \to P_{Rx})$. Thus, in a fraction of the transmission time, $S_{Tx}$ could listen to, and obtain, the message transmitted by $P_{Tx}$.
Although in practice the primary message must be obtained causally, as a first step, numerous works have idealized the concept of message knowledge: whenever the cognitive node $S_{Tx}$ is able to hear and decode the message of the primary node $P_{Tx}$, it is assumed to have full a priori knowledge. This assumption is often called the genie assumption, as these messages could have been given to the appropriate transmitters by a genie. The one-way double-yellow arrow in Figure 3 indicates that $S_{Tx}$ knows $P_{Tx}$'s message but not vice versa. This is the simplest form of asymmetric noncausal cooperation at the transmitters. The term cognitive is used to emphasize the need for $S_{Tx}$ to be a device capable of obtaining the message of the first user and altering its transmission strategy accordingly.
This asymmetric transmitter cooperation (cognitive channel) has elements in common with the competitive channel and the cooperative channel of Figure 3. In the competitive channel, the two transmitters compete for the channel, forming a classic information theoretic interference channel. The largest general region known to date for the interference channel is that described in [11]. Many of the results on the cognitive channel, which contains an interference channel if the noncausal side information is ignored, use a similar rate-splitting approach to derive large rate regions [10], [12], [13]. At the other extreme lies the cooperative channel, in which the two transmitters know each other's messages prior to transmission. This corresponds to the information theoretic two-transmit-antenna broadcast channel. A powerful and surprising technique called dirty-paper coding was recently shown to be capacity achieving in multiple-input, multiple-output (MIMO) Gaussian broadcast channels [14]. This technique, the application of Gel'fand-Pinsker coding [15] to Gaussian noise channels, as first described in [16], is applicable to channels in which the interference (or dirt) a receiver will see is noncausally known at the transmitter. By careful encoding, the channel with interference noncausally known to the transmitter, but not the receiver, may be made equivalent to an interference-free channel, at no power penalty. When using an encoding strategy that properly exploits this asymmetric message knowledge at the transmitters, the region (d) of Figure 1 is achievable and in certain cases corresponds to the capacity region of this channel [17], [18]. The encoding strategy used assumes both transmitters use random Gaussian codebooks. The primary transmitter continues to transmit its message at average power $P_1$. The secondary transmitter splits its transmit power $P_2$ into two portions, $P_2 = \alpha P_2 + (1 - \alpha) P_2$ for $0 \le \alpha \le 1$.
Part of its power, $\alpha P_2$, is spent in a selfless manner: on relaying the message of $P_{Tx}$ to $P_{Rx}$. The remainder of its power, $(1 - \alpha) P_2$, is spent in a selfish manner on transmitting its own message using the interference-mitigating technique of dirty-paper coding. This strategy may be thought of as selfish, as power spent on dirty-paper coding may harm the primary receiver (and is indeed treated as noise at $P_{Rx}$). The rate region (d) may be expressed as
$$\text{Asymmetric cooperation rate region (d)} = \left\{(R_1, R_2) \,\middle|\, 0 \le R_1 \le C\!\left(\frac{\left(\sqrt{P_1} + h_{21}\sqrt{\alpha P_2}\right)^2}{h_{21}^2 (1-\alpha) P_2 + 1}\right),\ 0 \le R_2 \le C\big((1-\alpha) P_2\big),\ 0 \le \alpha \le 1\right\}.$$
By varying $\alpha$, we can smoothly interpolate between a strictly selfless and a strictly selfish strategy. Of particular interest from a secondary spectrum licensing perspective is the fact that the primary user's rate $R_1$ may be strictly increased with respect to all three other cases (i.e., the x-intercept is now to the right of those of all three other cases). That is, by having the secondary user possibly relay the primary's message in a selfless manner, the system essentially becomes a 2 × 1 multiple-input, single-output (MISO) system, which sees all the associated capacity gains over noncooperating transmitters or antennas. This increased primary rate could motivate the primary user to share its codebook and message with the secondary user(s). The interference channel with asymmetric, noncausal transmitter cooperation was first introduced and studied in [10]. It was first called the cognitive radio channel and is also known as the interference channel with degraded message sets. Since then, a flurry of results for this channel, including capacity results in specific scenarios, has been obtained. When the interference to the primary user is weak ($h_{21} < 1$), rate region (d) has been shown to be the capacity region in Gaussian noise [18] and in related discrete memoryless channels [19]. In channels where interference at both receivers is strong, so that both receivers may decode and cancel out the interference, or where the cognitive decoder wishes to decode both messages, capacity is known [12], [20].
However, the most general capacity region remains an open question for both the Gaussian noise and the discrete memoryless channel cases. This 2 × 2, noncausal cognitive radio channel has been extended in a number of ways. While the above channel assumes noncausal message knowledge, a variety of two-phase half-duplex causal schemes have been presented in [10] and [21], while a full-duplex rate region was studied in [22]. Many achievable rate regions are derived by having the cognitive transmitter exploit knowledge of the exact interference seen at the receivers (e.g., dirty-paper coding in AWGN channels). The performance of dirty-paper coding when this assumption breaks down has been studied in the context of a compound channel in [23] and in a channel in which the interference is partially known [24]. Extensions to channels in which both the primary and secondary networks form classical multiple-access channels have been considered in [25], while cognitive transmissions using multiple antennas, without asymmetric transmitter cooperation, have been considered in [26]. We next explore the throughput scaling laws of large cognitive networks, where exact achievable rate regions remain elusive.
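To make the selfless/selfish power split of this section concrete, the following sketch sweeps the split parameter alpha. The rate expressions are assumptions of this example: unit direct gains and unit noise variance, the relayed portion adding coherently with the primary signal at the primary receiver, the dirty-paper-coded portion treated as noise there, and the cognitive receiver seeing an interference-free channel. The function name `cognitive_channel_rates` is hypothetical.

```python
import math


def C(x: float) -> float:
    """Capacity function C(x) = (1/2) log2(1 + x)."""
    return 0.5 * math.log2(1.0 + x)


def cognitive_channel_rates(alpha: float, P1: float, P2: float, h21: float):
    """Rate pair (R1, R2) for a given power split alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    relay_power = alpha * P2        # selfless part: relays the primary message
    own_power = (1.0 - alpha) * P2  # selfish part: dirty-paper-coded own message
    # Coherent combining of the primary signal and the relayed copy.
    signal = (math.sqrt(P1) + h21 * math.sqrt(relay_power)) ** 2
    r1 = C(signal / (1.0 + h21 * h21 * own_power))
    r2 = C(own_power)
    return r1, r2


if __name__ == "__main__":
    for step in range(5):
        a = step / 4
        r1, r2 = cognitive_channel_rates(a, P1=1.0, P2=1.0, h21=1.0)
        print(f"alpha={a:.2f}  R1={r1:.3f}  R2={r2:.3f}")
```

At alpha = 1 the cognitive user sends nothing of its own, and R1 exceeds the interference-free single-user rate C(P1), which matches the observation that the x-intercept of region (d) moves to the right.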

LARGE NETWORKS: THE SCALING LAWS
As single-link wireless technologies have matured over the past decades, it is of great interest to determine how these devices perform in larger networks. These networks can contain primary and cognitive users that are ad hoc (ad hoc cognitive networks), or they can contain some infrastructure support for the primary users (infrastructure-supported cognitive networks). Applications of these networks abound: for example, mobile IP networks, smart home devices, spontaneously formed disaster recovery or military networks, and dispersed sensor networks.
Contrary to the well-understood capacity of a point-to-point link, the capacity of a network remains less defined. Multiple dimensions play a role: the number of nodes in the network, the node density, the network geometry, and the power and rate of each node. These multiple dimensions make characterizing the capacity regions of a network particularly challenging. An initial step to understanding the network capacity is looking at its sum rate, or throughput. This measure is particularly relevant in a large network, in which nodes can join the network at random and the network size can grow to be large. What order of throughput the network can sustain as more nodes join is of particular interest. This throughput order is often referred to as the scaling law of the network, that is, the growth of the network throughput versus the number of users.
The scaling of a network's throughput often depends on a number of factors: the network geometry, the node distribution, the node's physical-layer processing capability, and whether there is infrastructure support. Two often studied network geometries are dense and extended networks. Dense networks have constant area and increased node density as more nodes join the network. In contrast, extended networks have constant node density and increased area with more nodes. Such a network geometry can affect the scaling law significantly, since dense networks can be interference limited while extended networks are often power limited. Scaling results for one type of network, however, can often be transformed to that of the other after appropriate power scaling. For this reason, we will focus on extended networks subsequently.
Before examining the scaling law of a cognitive network that contains different types of users (heterogeneous), it is instructive to discuss recent results on the scaling law of an extended, homogeneous network. For homogeneous, ad hoc networks, in which $n$ nodes of the same type are located randomly, the scaling law depends strongly on the node distribution and the physical-layer processing capability, more specifically the ability of nodes to cooperate. In the interference-limited regime, in which no cooperation is allowed (except simple forwarding) and all nodes treat other signals as interference, the per node throughput (which equals the sum rate divided by $n$) scales at most as $1/\sqrt{n}$ [27]. If the nodes are uniformly distributed, a simple nearest-neighbor forwarding scheme achieves only $1/\sqrt{n \log n}$ per node throughput [27]. When the nodes are distributed according to a Poisson point process, however, a backbone-based routing scheme achieves the per node scaling of $1/\sqrt{n}$ [28], meeting the upper bound.
On the other hand, when nodes are able to cooperate, a much different scaling law emerges. Upper bounds based on the max-flow min-cut bound [29]-[31] as well as MIMO techniques [30] have been analyzed for various ranges of path loss exponent. For path loss exponent $\alpha$ between two and three, a hierarchical scheme can achieve a throughput growth of order $n^{2-\alpha/2}$ [30] (asymptotically linear for $\alpha = 2$). Here nodes form clusters; nodes within a cluster exchange information and then cooperate to communicate to nodes in another cluster. This cluster formation may be layered (clusters of clusters), forming a hierarchical scheme in which eventually all nodes will be able to cooperate in a MIMO fashion. For path loss exponent greater than three, the nearest-neighbor multihop scheme is scaling-optimal and achieves a throughput of order $\sqrt{n}$. For infrastructure-supported networks, such as the cellular, WiFi, or TV network, the scaling law can be improved if the infrastructure density is above a critical value [32]. Other factors that affect the scaling law of an infrastructure-supported network include the number of antennas at the base stations (BS) or infrastructure, BS transmit power, and routing protocols [33]. The infrastructure can help extend the linear scaling to a larger range of path loss (than just $\alpha = 2$), depending on the base-station scaling.
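The ad hoc scaling orders quoted above, for networks whose nodes may cooperate, can be collected into a small helper that returns the exponent e in a total throughput of order n^e. It only restates the two regimes mentioned in the text (hierarchical cooperation for path loss exponent between two and three, nearest-neighbor multihop beyond three). The function name and the error handling are assumptions of this sketch.

```python
def throughput_exponent(alpha: float) -> float:
    """Exponent e such that the total throughput grows as n**e.

    2 <= alpha <= 3: hierarchical cooperation, e = 2 - alpha / 2.
    alpha > 3:       nearest-neighbor multihop, e = 1 / 2.
    """
    if alpha < 2.0:
        raise ValueError("path loss exponent is assumed to be at least 2")
    if alpha <= 3.0:
        return 2.0 - alpha / 2.0
    return 0.5


def growth_when_doubling(alpha: float) -> float:
    """Factor by which the throughput order grows when n doubles."""
    return 2.0 ** throughput_exponent(alpha)
```

The two regimes meet at a path loss exponent of three, where both give an exponent of one half, so the exponent is continuous in alpha.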
Consider now cognitive networks, which contain different types of users with unequal access priority to the network resources. Often, the primary users have higher priority access to the spectrum. The cognitive users, on the other hand, may need to sense their environment and operate on an opportunistic basis in an environment with persistent interference from the primary users. Will this interference affect their throughput scaling? At the same time, they need to operate in a way that guarantees a certain performance for the primary users. We will discuss this latter constraint in the next section on communication theoretic limits, while focusing on the scaling of the throughput of cognitive users in this section.

SINGLE-HOP COGNITIVE NETWORKS WITH CONSTANT POWER
Consider a planar cognitive network that has fixed node densities and size growing with the number of nodes. As a specific instance, we study a circular network with radius R. To scale the number of cognitive and primary users, we let R increase. Other shapes also produce a similar scaling law.
The network model is depicted in Figure 4. Within the network, there are $m$ primary users and $n$ cognitive users. Around each receiver, either primary or cognitive, we assume a protected circle of radius $\epsilon > 0$, in which no interfering transmitter may operate. This is a practical constraint to simply ensure that the interfering transmitter and receiver are not located at exactly the same point. Other than the receiver protected regions, the primary transmitters' and receivers' locations are arbitrary, subject to a minimum distance $R_0$ between any two primary transmitters. This scenario corresponds to a broadcast network, such as the TV or the cellular networks, in which the primary transmitters are BS. The cognitive transmitters, on the other hand, are uniformly and randomly distributed with constant density $\lambda$. We assume that each cognitive receiver is within a distance $D_{\max}$ from its transmitter. In this section, we consider constant transmit power for the cognitive users. $D_{\max}$ is then a constant which can be prechosen to fit a large network size and applied to all networks. (In the next section, we discuss the case in which cognitive users can scale their transmit power.) Assume a large-scale network in which channel gains are path-loss dependent only. The channel power gain $g$ between nodes a distance $d$ apart is $g = 1/d^{\alpha}$, where $\alpha > 2$ is the power path loss exponent. Assume no cooperation: each user treats unwanted signals from all other users as noise, similar to the interference-limited regime in the previously discussed ad hoc networks.
Consider the transmission rate of each cognitive user. This rate is affected by two factors: the received signal power and the interference power at the cognitive receiver. Because of the bounded Tx-Rx distance $D_{\max}$, if the cognitive user employs single-hop transmission, the received signal power is always above a constant level of $P/D_{\max}^{\alpha}$, where $P$ is the cognitive user's transmit power. We furthermore assume that each cognitive receiver has a protected circle of radius $\epsilon_c > 0$, in which no interfering transmitter may operate.
The average interference power to any cognitive receiver from all other users is bounded. Specifically, this interference includes that from the primary users and from other cognitive users. Because the primary users are at arbitrary but nonrandom locations, their total interference is deterministic. In the worst case (largest interference) scenario, when these primary users are placed regularly on a hexagonal lattice of distance $R_0$, their interference is upper bounded by a constant as their number increases ($m \to \infty$). The interference from other cognitive users, however, is random because of their random locations. Nevertheless, their average interference is still bounded.
This bounded average interference, coupled with the constant minimum-received-signal-power, leads to a constant lower bound on the average transmission rate of each cognitive user. With constant transmit power, the transmission rate of each cognitive user is upper bounded by a constant by removing the interference from other cognitive users. Since both the lower and upper bounds to each user's average transmission rate are constant, the average network throughput grows linearly with the number of users.
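One way to make the lower-bound step concrete is the following sketch. It is not taken from [34]; it assumes (as labeled hypotheticals) a Gaussian-signaling rate $\log_2(1+\mathrm{SINR})$, a noise power $\sigma^2$, a random total interference $I$ with bounded mean $\bar{I}$, and generic constants $C_1$, $C_2$ for the per-user bounds. The inequality is Jensen's, applied to the function $x \mapsto \log_2(1 + a/x)$, which is convex for $x > 0$.

```latex
% Per-user lower bound: received power is at least P / D_max^alpha,
% and Jensen's inequality moves the expectation inside the convex map.
\mathbb{E}[R_i]
  \;\ge\; \mathbb{E}\!\left[\log_2\!\left(1 + \frac{P/D_{\max}^{\alpha}}{\sigma^2 + I}\right)\right]
  \;\ge\; \log_2\!\left(1 + \frac{P/D_{\max}^{\alpha}}{\sigma^2 + \bar{I}}\right)
  \;=:\; C_1 > 0 .

% With a constant per-user upper bound C_2, summing over the n users gives
n\,C_1 \;\le\; \mathbb{E}\!\left[\sum_{i=1}^{n} R_i\right] \;\le\; n\,C_2 ,
\qquad\text{i.e. the average throughput is } \Theta(n).
```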
Further concentration analysis shows that any network realization indeed achieves this linear throughput with high probability. Specifically, this probability approaches one at an exponential rate of $e^{-n}/\sqrt{n}$ [34].

SINGLE-HOP COGNITIVE NETWORKS WITH DISTANCE-DEPENDENT POWER
Consider as a special case a large network with a single primary user who has the transmitter at the network center and the receiver at some distance $R_0$ away. The cognitive users can detect the location of, and hence the distance to, the primary transmitter and can then scale their transmit power according to that distance. Specifically, suppose that a cognitive user at distance $r$ transmits with power

$$P = P_c\, r^{\gamma},$$

where $P_c$ is a constant. Then, provided that $0 \le \gamma < \alpha - 2$, the total interference from the cognitive users to the primary user is still bounded, making the power scaling an attractive option for the cognitive users.
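The condition $\gamma < \alpha - 2$ can be motivated by a rough far-field calculation. The sketch below is an approximation, not the derivation in [34]: it assumes a power law $P_c r^{\gamma}$ for the cognitive transmit power, treats the primary receiver as if it sat at the network center (reasonable for distant cognitive users, since $R_0$ is fixed), and uses a hypothetical inner radius $\epsilon$ and outer network radius $R$.

```latex
% Average interference from cognitive users of density lambda in the annulus [eps, R]:
\mathbb{E}[I]
  \;\approx\; \int_{\epsilon}^{R} \lambda \, P_c r^{\gamma} \, r^{-\alpha} \, 2\pi r \, dr
  \;=\; \frac{2\pi\lambda P_c}{\alpha - 2 - \gamma}
        \left( \frac{1}{\epsilon^{\alpha-2-\gamma}} - \frac{1}{R^{\alpha-2-\gamma}} \right).

% The integrand behaves as r^{\gamma - \alpha + 1}; the integral converges as
% R \to \infty if and only if \gamma - \alpha + 1 < -1, i.e. \gamma < \alpha - 2.
```

Cognitive users close to the primary receiver transmit with power of order $P_c R_0^{\gamma}$, a constant, so under this sketch they do not affect whether the total stays bounded.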
With power scaling, the maximum distance $D_{\max}$ between a cognitive Tx and Rx can now grow with the network size as

$$D_{\max} \le K_d\, r^{\gamma/\alpha},$$

where $r$ again is the distance from the cognitive transmitter to the primary transmitter and $K_d$ is a constant. Thus, depending on the path loss exponent $\alpha$, the cognitive Tx-Rx distance can grow with an exponent of up to $1 - 2/\alpha$. For a large $\alpha$, this growth is almost at the same rate as the network.
The average throughput of the cognitive users may now grow faster than linear. Specifically, using bounding techniques, we can conclude that with positive power scaling (γ > 0), the average throughput of the cognitive users scales at least linearly and at most as n log(n) [34].

MULTIHOP COGNITIVE NETWORKS
If single-hop cognitive networks can achieve a linear cognitive throughput, what can a multihop network achieve? Consider a cognitive network consisting of multiple primary and multiple cognitive users. Both types of users are ad hoc, randomly distributed according to Poisson point processes with different densities. Here there is no restriction on the maximum cognitive Tx-Rx distance, and these cognitive transmitters and receivers form communication pairs randomly, much in the same fashion as a stand-alone ad hoc network. Then, provided that the cognitive node density is higher than the primary node density, it can be shown that with multihop routing both types of users, primary and cognitive, can achieve a throughput scaling as if the other type of users were not present [35]. Specifically, the throughput of the m primary users scales as m/log m, and that of the n cognitive users as n/log n. Furthermore, these throughput scalings are achieved while the primary users maintain their transmission protocols. In other words, the primary users can operate using nearest-neighbor forwarding without regard to the presence of the cognitive users. The cognitive users, on the other hand, rely on their higher density and use clever routing to avoid interfering with the primary users while still reaching their own destinations. This cognitive routing protocol defines a preservation region around each primary node, which the cognitive users must avoid routing through. The cognitive users therefore can still use nearest-neighbor routing while taking care to avoid the preservation regions. Provided the preservation region size scales appropriately with the number of cognitive users, the cognitive users can achieve their throughput scaling as if there were no primary users.

COGNITIVE NETWORKS: MODELS AND DESIGN
In this section, we model and analyze cognitive networks in order to intelligently design and set their operating parameters. Specifically, we focus on the impact of the cognitive users on the primary users in terms of the interference power, or interference temperature, generated by these cognitive users at the primary user(s). The interference is of interest in simple, realistic networks, as it directly affects the performance of the primary users. Interference analysis has been studied by a number of authors (see, for example, [7], [8], and [36]). The results can be used to design various network parameters to guarantee certain performance to the primary users. In this section, we aim to provide only an example of this interference analysis and its application in two different network settings: a network with beacon and a network with exclusive regions for the primary users.

INTERFERENCE ANALYSIS
Consider again an extended network in which the cognitive users are uniformly distributed with constant density. Assume a circular network shape with radius $R_n$, which increases as the number of cognitive users increases. Consider a channel with path loss and small-scale fading. The interference depends on the locations of the cognitive users, which are random, and on the random channel fading. Hence this interference is random. Using the constant cognitive user density $\lambda$, the average total interference from all $n$ cognitive users to the worst-case primary receiver, which may be shown to be the primary receiver at the center of the circular network, can be computed as [37]

$$\mathbb{E}[I] = \frac{2\pi\lambda P}{\alpha - 2}\left(\frac{1}{\epsilon^{\alpha-2}} - \frac{1}{R_n^{\alpha-2}}\right), \qquad (2)$$

where $\epsilon$ is the receiver-protected radius as discussed in "Small Networks: Achievable Rate and Capacity Regions," $R_n$ is the network radius, and $P$ is the cognitive transmit power. Provided the path loss exponent $\alpha > 2$, the average interference is bounded, even as the number of cognitive users approaches infinity ($n \to \infty$ or $R_n \to \infty$). The variance of the interference can also be analyzed; it is highly dependent on the channel fading and the cognitive user spatial distribution.
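As an illustration of why the average interference stays bounded for α > 2, the sketch below integrates the path-loss gain over the annulus between the protected radius and the network radius, once in closed form and once numerically. It assumes unit-mean fading, so that fading drops out of the average; the function names and the parameter `eps` for the protected radius are hypothetical and chosen only for this example.

```python
import math


def mean_interference(lam, P, alpha, eps, R_n):
    """Closed-form integral of lam * P * r**(-alpha) * 2*pi*r over eps <= r <= R_n.

    Assumes unit-mean fading and a receiver at the center of the network.
    """
    if alpha <= 2:
        raise ValueError("this closed form requires alpha > 2")
    k = alpha - 2
    return 2 * math.pi * lam * P / k * (eps ** -k - R_n ** -k)


def mean_interference_numeric(lam, P, alpha, eps, R_n, steps=200_000):
    """Midpoint-rule evaluation of the same integral, as a sanity check."""
    dr = (R_n - eps) / steps
    total = 0.0
    for i in range(steps):
        r = eps + (i + 0.5) * dr  # midpoint of the i-th radial slice
        total += lam * P * r ** (-alpha) * 2 * math.pi * r * dr
    return total


if __name__ == "__main__":
    for R_n in (10.0, 100.0, 1000.0):
        print(R_n, mean_interference(1.0, 1.0, 3.0, 1.0, R_n))
```

As `R_n` grows, the printed values approach the limit 2πλP / ((α − 2) ε^(α−2)) from below, which is the bounded-interference behavior described above.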
The average interference can be used to either limit the transmit power of the cognitive users, or to design certain network parameters to limit the interference impact on the primary users. Next, we discuss two examples of how the interference analysis can be applied to design network parameters.

A NETWORK WITH BEACON
In a network with beacon, the primary users transmit a beacon before each transmission. This beacon is received by all users in the network. The cognitive users, upon detecting this beacon, will abstain from transmitting for the duration of the primary transmission that follows. The mechanism is designed to avoid interference from the cognitive users to the primary users.
In practice, however, because of channel fading, the cognitive users may sometimes overlook the beacon. They can then transmit concurrently with the primary users, creating interference. This interference depends on certain parameters, such as the beacon detection threshold, the distance between the primary transmitter and receiver and the receiver protected radius. By designing network parameters, such as the beacon detection threshold, we can control this interference to limit its impact on the primary users' performance.
Assume a simple power detection threshold (which can either be the received beacon power or the power after some processing). Let $\gamma$ denote the ratio between this power threshold and the beacon transmit power; we will simply call $\gamma$ the beacon detection threshold. With Rayleigh channel fading, the probability of a cognitive user missing the beacon depends on this beacon threshold $\gamma$ and the distance $d$ to the primary transmitter as [37]

$$q(d) = 1 - e^{-\gamma d^{\alpha}},$$

where again $\alpha$ is the path loss exponent. When a cognitive user misses the beacon, that user may transmit concurrently with the primary user with probability $\beta$ (also called the cognitive user's activity factor). These parameters can be used to bound the generated interference [37].
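The dependence of the miss probability on the threshold and the distance can be illustrated with a short sketch. It assumes unit-mean Rayleigh fading, so that the fading power gain is exponentially distributed with mean one, and a beacon that is missed whenever the ratio of received power to beacon transmit power falls below γ. Under these assumptions the miss probability is 1 − exp(−γ d^α). The function names are hypothetical.

```python
import math
import random


def miss_probability(gamma, d, alpha):
    """Probability that the received-to-transmit beacon power ratio is below gamma.

    Assumes power gain |h|^2 / d**alpha with |h|^2 ~ Exponential(1).
    """
    return 1.0 - math.exp(-gamma * d ** alpha)


def miss_probability_mc(gamma, d, alpha, trials=200_000, seed=0):
    """Monte Carlo estimate of the same probability, as a sanity check."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        fading_gain = rng.expovariate(1.0)       # |h|^2 for Rayleigh fading
        if fading_gain / d ** alpha < gamma:     # beacon falls below threshold
            misses += 1
    return misses / trials


if __name__ == "__main__":
    for d in (0.5, 1.0, 2.0, 4.0):
        print(d, miss_probability(0.2, d, 2.1), miss_probability_mc(0.2, d, 2.1))
```

The printed values increase with the distance `d`, consistent with the observation that cognitive users far from the primary transmitter are the ones most likely to miss the beacon.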
In particular, the interference bound versus the beacon detection threshold can be graphed as in Figure 6. We see that as the beacon threshold increases, the cognitive users are more likely to miss the beacon and therefore increase the average interference to the primary user. The case when the cognitive transmitters are always transmitting (a beaconless system) corresponds to $\gamma = \infty$. This limit is approached quickly for finite values of $\gamma$. The convergence rate, however, depends on other parameters such as $\alpha$, $R_0$, and $P$. Figure 7 shows the plots of this bound versus the primary Tx-Rx distance $R_0$ (for $\alpha = 2.1$ and $\gamma = 0.2$). The bound is monotonically increasing in $R_0$. As $R_0$ increases, however, the interference upper bound approaches a fixed limit. Most of the interference comes from the cognitive transmitters close to the primary receiver. When this receiver is far away from the primary transmitter ($R_0$ is large), these cognitive users are likely to always miss the beacon and hence create a constant interference level to the primary user.

A NETWORK WITH PRIMARY EXCLUSIVE REGIONS
Another way of limiting the impact of the cognitive users on the primary users is to impose a certain distance from the primary user, within which the cognitive users cannot transmit. This configuration appears suitable to a broadcast network in which there is one primary transmitter communicating with multiple primary receivers. Examples include the TV network or the downlink in the cellular network. In such networks, the primary receivers may be passive devices and therefore are hard to detect by the cognitive users, in contrast to the primary transmitter whose location can be easily inferred. Thus it may be reasonable to place an exclusive radius $R_0$ around the primary transmitter, within which no cognitive transmissions are allowed. We call this a primary-exclusive region (PER). Such regions have been proposed for the upcoming spectrum sharing of the TV band [38].
Similar to previous networks, we also assume a receiver-protected radius of $\epsilon$ around each receiver. This implies that any cognitive transmitter must be at least an $\epsilon$ radius away from a primary receiver. Assuming the location of the primary receiver is unknown to the cognitive users, this condition results in a guard band of width $\epsilon$ around the PER, in which no cognitive transmitters may operate. The relation between $\epsilon$ and the PER radius $R_0$ will be discussed later.
The cognitive transmitters are randomly and uniformly distributed outside the PER and protected band, within a network radius $R$ from the primary transmitter. As the number of cognitive users increases, $R$ increases. The network model is shown in Figure 8. Of interest is how to design the exclusive radius $R_0$, given other network parameters, to guarantee certain performance to the primary users. Here we are interested in the primary user's outage capacity: a minimum rate for a certain portion of time, or equivalently, with a certain probability. This outage capacity is the data rate at a primary receiver within the PER. Since the receiver can be anywhere within the PER, we need to guarantee the performance for the worst case scenario, in which the primary receiver is at the edge of the PER in a network with an infinite number of cognitive users ($R \to \infty$).

[FIG8] A cognitive network consists of a single primary transmitter at the center of a PER with radius $R_0$, which contains its intended receiver. Surrounding the PER is a protected band of width $\epsilon > 0$. Outside the PER and the protected bands, $n$ cognitive transmitters are distributed randomly and uniformly with density $\lambda$.
Using the interference power analysis (2), coupled with the outage constraint, we can then derive an explicit relation between $R_0$ and other parameters including the protected radius $\epsilon$, the transmit power of the primary user $P_0$ and that of the cognitive users $P$. Assuming that we want to guarantee an outage capacity $C_0$ to the primary user with the probability $\beta$ (that means for $\beta$ fraction of time, the transmission rate of the primary user is at least $C_0$), then the PER radius $R_0$ must satisfy a bound, (4), derived in [39], in which $\alpha$ is again the path loss exponent and $\lambda$ is the cognitive user density. An example of the relation between $R_0$ and $\epsilon$, as given in (4), is shown in Figure 9 for $\alpha = 4$. Here $C_0$ is specified as the fraction of the maximum transmission rate possible for the primary user. The plots show that $R_0$ increases with $\epsilon$, and the two are of approximately the same order. This makes sense since at the primary receiver, there is a tradeoff between the interference seen from the secondary users, which are at least a distance $\epsilon$ away, and the desired signal strength from the primary transmitter, which is at most a distance $R_0$ away. The larger the $\epsilon$, the less interference, and thus the further away the primary receiver may lie from the transmitter. The PER radius $R_0$, however, decreases with increasing $C_0$. This is intuitively appealing since to guarantee a higher capacity, the received signal strength at the primary receiver must increase, requiring the receiver to be closer to the transmitter.
Another tradeoff that (4) reveals is the relation between $R_0$ and the primary transmitter power $P_0$, as shown in Figure 10. We observe the fourth-order increase in power here, which is in line with the path loss exponent $\alpha = 4$. The figure shows that a small increase in the receiver-protected radius $\epsilon$ can lead to a large reduction in the primary transmit power $P_0$ required to reach a receiver at a given radius $R_0$ while satisfying the given outage constraint.

[FIG9] The relation between the exclusive region radius $R_0$ and the guard band $\epsilon$ according to (4) for $\lambda = 1$, $P = 1$, $P_0 = 100$, $\sigma^2 = 1$, $\beta = 0.1$ and $\alpha = 3$.
[FIG10] The relation between the BS power $P_0$ and the exclusive region radius $R_0$ according to (4) for $\lambda = 1$, $P = 1$, $\sigma^2 = 1$, $\beta = 0.1$, $C_0 = 3$ and $\alpha = 3$.

CONCLUSIONS
We have showcased a number of results relating to the fundamental communication limits in cognitive networks. We first discussed how, for small networks, different levels of cognition, or information about the wireless environment, in the secondary node(s) lead to different achievable rate and capacity regions. For large networks, we provided the throughput scaling laws for three cognitive networks. Turning attention to the design of network parameters and communication protocols, the interference seen by the primary receivers from cognitive radios is of great importance. We outlined examples of interference analyses and their impacts in cognitive networks with beacons and with primary exclusive regions. These surveyed results