An Earth-sized planet with an Earth-like density

Recent analyses [1-4] of data from the NASA Kepler spacecraft [5] have established that planets with radii within 25 per cent of the Earth's ($R_\oplus$) are commonplace throughout the Galaxy, orbiting at least 16.5 per cent of Sun-like stars [1]. Because these studies were sensitive to the sizes of the planets but not their masses, the question remains whether these Earth-sized planets are indeed similar to the Earth in bulk composition. The smallest planets for which masses have been accurately determined [6,7] are Kepler-10b (1.42 $R_\oplus$) and Kepler-36b (1.49 $R_\oplus$), which are both significantly larger than the Earth. Recently, the planet Kepler-78b was discovered [8] and found to have a radius of only 1.16 $R_\oplus$. Here we report that the mass of this planet is 1.86 Earth masses. The resulting mean density of the planet is 5.57 g cm⁻³, which is similar to that of the Earth and implies a composition of iron and rock.
Every 8.5 h, the star Kepler-78 (first known as TYC 3147-188-1 and later designated KIC 8435766) presents to the Earth a shallow eclipse consistent [8] with the passage of an orbiting planet with a radius of 1.16 ± 0.19 $R_\oplus$. A previous study [8] demonstrated that it was very unlikely that these eclipses were the result of a massive companion either to Kepler-78 itself or to a fainter star near its position on the sky. Judging from the absence of ellipsoidal light variations [8] of the star, the upper limit on the mass of the planet is 8 Earth masses ($M_\oplus$). In addition to its diminutive size, the planet Kepler-78b is interesting because the light curve recorded by the Kepler spacecraft reveals the secondary eclipse of the planet behind the star as well as the variations in the light received from the planet as it orbits the star and presents different hemispheres to the observer. These data enabled constraints [8] to be put on the albedo and temperature of the planet. A direct measurement of the mass of Kepler-78b would permit an evaluation of its mean density and, by inference, its composition, and motivated this study.
The newly commissioned HARPS-N [9] spectrograph is the Northern Hemisphere copy of the HARPS [10] instrument, and, like HARPS, HARPS-N allows scientific observations to be made alongside thorium-argon emission spectra for wavelength calibration [11]. HARPS-N is installed at the 3.57-m Telescopio Nazionale Galileo at the Roque de los Muchachos Observatory, La Palma Island, Spain. The high optical efficiency of the instrument enables a radial-velocity precision of 1.2 m s⁻¹ to be achieved in a 1-h exposure on a slowly rotating late-G-type or K-type dwarf star with stellar visible magnitude $m_v = 12$. By observing standard stars of known radial velocity during the first year of operation of HARPS-N, we estimated it to have a precision of at least 1 m s⁻¹, a value which is roughly half the semi-amplitude of the signal expected for Kepler-78b should the planet have a rocky composition. We began an intensive observing campaign (Methods) of Kepler-78 ($m_v = 11.72$) in May 2013, acquiring HARPS-N spectra of 30-min exposure time and an average signal-to-noise ratio of 45 per extracted pixel at 550 nm (wavelength bin of 0.00145 nm). From these high-quality spectra, we estimated [12,13] the stellar parameters of Kepler-78 (Methods and Extended Data Table 1). Our estimate of the stellar radius, $R_\star = 0.737^{+0.034}_{-0.042}\,R_\odot$, is more accurate than any previously known [8] and allows us to refine the estimate of the planetary radius.
In the Supplementary Data, we provide a table of the radial velocities, the Julian dates, the measurement errors, the line bisector of the cross-correlation function, and the Ca II H-line and K-line activity indicator [14], $\log(R'_{\rm HK})$. The radial velocities (Fig. 1) show a scatter of 4.08 m s⁻¹ and a peak-to-trough variation of 22 m s⁻¹, which exceeds the estimated average internal (photon-noise) precision of 2.3 m s⁻¹. The excess scatter is probably due to star-induced effects including spots and changes in the convective blueshift associated with variations in the stellar activity. These effects may cause an apparent signal at intervals corresponding to the stellar rotational period and its first and second harmonics. To separate this signal from that caused by the planet, we proceeded to estimate the rotation period of the star from the de-trended light curve from Kepler (Methods). Our estimate, of ~12.6 d, is consistent with a previous estimate [8]. The power spectral density of the de-trended light curve also shows strong harmonics at respective periods of 6.3 and 4.2 d. We note that these timescales are much longer than the orbital period of the planet.
[Fig. 1 caption, partial: The raw radial-velocity dispersion is 4.08 m s⁻¹, which is about twice the photon noise. JD, Julian date.]
To estimate the radial-velocity semi-amplitude, $K_p$, due to the planet, we proceeded under the assumption that $K_p$ was much larger than the change in radial velocity arising from stellar activity during a single night. This is a reasonable assumption, because a typical 6-h observing sequence spans only 2% of the stellar rotation cycle. Furthermore, the relative phase between the stellar signal and the planetary signal changes continuously, and so we expect that the contribution from stellar activity should average out over the three-month observing period. Assuming a circular orbit, we modelled the radial velocity of the $i$th observation (gathered at time $t_i$ on night $d$) as $v_{i,d} = v_{0,d} - K_p \sin(2\pi(t_i - t_0)/P)$, where $t_0$ is the epoch of mid-transit and $P$ is the orbital period, each held fixed at the values derived from the photometry, and $v_{0,d}$ is the night-$d$ zero point (an offset value estimated independently for each night). We solved for the values of $K_p$ and $v_{0,d}$ by minimizing the $\chi^2$ function (least-squares minimization), assuming white noise and weighting the data according to the inverse variances derived from the HARPS-N noise model. This is a technique similar to the one used in a recent study [15] of another low-mass transiting planet, CoRoT-7b.
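Because $P$ and $t_0$ are held fixed, the model above is linear in $K_p$ and in the nightly zero points, so the $\chi^2$ minimum can be found with ordinary weighted least squares. The following is a minimal sketch of that idea on synthetic data, not the analysis code used for the paper: the function name `fit_nightly_offsets`, the random seed, the number of nights, the injected amplitude and the offset scatter are all illustrative assumptions.

```python
import numpy as np

def fit_nightly_offsets(t, rv, sigma, night, period, t0):
    """Weighted least squares for v = v0[night] - K * sin(2*pi*(t - t0)/period)."""
    nights = np.unique(night)
    # One indicator column per night (the zero points) plus one column for K_p.
    A = np.zeros((t.size, nights.size + 1))
    for j, d in enumerate(nights):
        A[night == d, j] = 1.0
    A[:, -1] = -np.sin(2.0 * np.pi * (t - t0) / period)
    w = 1.0 / sigma                      # scaling rows by 1/sigma makes lstsq minimise chi^2
    Aw = A * w[:, None]
    coeffs, *_ = np.linalg.lstsq(Aw, rv * w, rcond=None)
    cov = np.linalg.inv(Aw.T @ Aw)       # parameter covariance for known white noise
    return coeffs[-1], np.sqrt(cov[-1, -1]), coeffs[:-1]

# Synthetic example (assumed values, chosen only to resemble the scales in the text).
rng = np.random.default_rng(42)
period, t0 = 0.355, 0.0                  # days
k_true, sigma_rv = 2.0, 2.3              # m/s
n_nights, n_per_night = 10, 12           # hypothetical campaign layout
night = np.repeat(np.arange(n_nights), n_per_night)
t = night + np.tile(np.arange(n_per_night) * 0.5 / 24.0, n_nights)
offsets_true = rng.normal(0.0, 4.0, n_nights)   # stand-in for slow stellar activity
sigma = np.full(t.size, sigma_rv)
rv = (offsets_true[night]
      - k_true * np.sin(2.0 * np.pi * (t - t0) / period)
      + rng.normal(0.0, sigma_rv, t.size))

k_fit, k_err, offsets_fit = fit_nightly_offsets(t, rv, sigma, night, period, t0)
print(f"K_p = {k_fit:.2f} +/- {k_err:.2f} m/s")
```

The nightly offsets absorb any variation that is constant within a night, which is why this approach does not need an explicit model of the stellar activity.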
With this technique, we measure a preliminary value of $K_p = 2.08 \pm 0.32$ m s⁻¹, implying a detection significance of 6.5σ. We confirmed that the radial-velocity signal is consistent with the photometric transit ephemeris by repeating the $\chi^2$ optimization over a grid of orbital frequencies and times of mid-transit (Fig. 2a). This confirms that the values of $P$ and $t_0$ from the Kepler light curves coincide with the lowest $\chi^2$ minimum in the resulting period-phase diagram.
The Kepler light curve evolves on a timescale of weeks. We therefore expect the stellar-activity-induced radial-velocity signals to remain coherent on the same timescale. To explore this, we fitted the radial velocities with a sum of Keplerian models at different periods, one of which we expect to correspond to the planetary orbit and the others (which we left as free parameters) to represent the effects of stellar activity. Using the Bayesian information criterion as our discriminant (Methods), we found that a model with three Keplerian functions was sufficient to explain the data. The 4.2-d period of the second Keplerian function corresponds to the second harmonic of the photometrically determined stellar rotation period. The period of 10 d for the third Keplerian also appears as a prominent peak in the generalized Lomb-Scargle analysis [16] of the radial-velocity data. Strong peaks at similar periods are also present in periodograms of the Ca II HK activity indicator, the line bisector and the full-width at half-maximum of the cross-correlation function [11] (Extended Data Fig. 1). We conclude that both the 4.2-d and 10-d signals probably have stellar causes.
The best three-Keplerian fit of the data yields an estimate of $K_p = 1.96 \pm 0.32$ m s⁻¹ and residuals with a dispersion of 2.34 m s⁻¹, very close to the internal noise estimates. In Fig. 2b, we show the phase-folded radial velocities after removal of the stellar components, plotted along with the best-fit Keplerian at the planetary orbital period. The orbital parameters are given in Table 1. Having settled on the three-component model, we estimated the planetary mass and density by conducting a Markov chain Monte Carlo (MCMC) analysis (Methods). In this analysis, we adopted previously published [8] values of $P$ and $t_0$ as Gaussian priors. We replaced the planet radius, $R_p$, with the published estimate [8] of the ratio $k = R_p/R_\star$ and our determination of $R_\star$. The planetary radius then becomes an output of the MCMC analysis. By adopting the mode of the distributions, we find a planet mass of $m_p = 1.86^{+0.…}$ …

RESEARCH LETTER
the uncertainty in the stellar mass. The uncertainty in the density is dominated by the uncertainty in $k$. Our values for $m_p$ and $\rho_p$ are consistent with those from an independent study [17].
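As a rough cross-check of how a semi-amplitude translates into a mass, the standard relation for a planet much lighter than its star, $m_p \sin i = K_p\,(P/2\pi G)^{1/3} M_\star^{2/3} \sqrt{1-e^2}$, can be evaluated directly. The sketch below uses the values quoted in this paper ($K_p = 1.96$ m s⁻¹, $P = 0.355$ d, $M_\star = 0.758\,M_\odot$) and assumes a circular, edge-on orbit ($\sin i = 1$), so it returns a minimum mass. The physical constants and the function name `planet_mass_earth` are not from the paper, and this point estimate is not a substitute for the MCMC analysis.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30     # kg
M_EARTH = 5.972e24   # kg
DAY = 86400.0        # s

def planet_mass_earth(k_ms, period_d, mstar_msun, ecc=0.0):
    """Minimum planet mass (m_p sin i) in Earth masses, valid for m_p << M_star."""
    period_s = period_d * DAY
    mstar_kg = mstar_msun * M_SUN
    m_sin_i = (k_ms
               * (period_s / (2.0 * math.pi * G)) ** (1.0 / 3.0)
               * mstar_kg ** (2.0 / 3.0)
               * math.sqrt(1.0 - ecc ** 2))
    return m_sin_i / M_EARTH

mass = planet_mass_earth(1.96, 0.355, 0.758)
print(f"m_p sin i is about {mass:.2f} Earth masses")
```

The result, roughly 1.8 $M_\oplus$, is close to the mass reported from the full posterior, as expected for a transiting planet whose orbit is nearly edge-on.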
In terms of mass, radius and mean density, Kepler-78b is the most similar to the Earth among the exoplanets for which these quantities have been determined. We plot the mass-radius diagram in Fig. 3. By comparing our estimates of Kepler-78b with theoretical models [18] of internal composition, we find that the planet has a rocky interior and most probably a relatively large iron core (perhaps comprising 40% of the planet by mass). We note that in the part of the mass-radius diagram where Kepler-78b lies, there is a general agreement between models and little or no degeneracy. The extreme proximity of the planet to its star, resulting in a high surface temperature and ultraviolet irradiation, would preclude there being a low-molecular-weight atmosphere: any water or volatile envelope that Kepler-78b might have had at formation should have rapidly evaporated [19]. Kepler-78b is also similar to larger high-density, hot exoplanets (Kepler-10b (ref. 6), Kepler-36b (ref. 7) and ), in that in the mass-radius diagram it is not below the lower envelope of mantle-stripping models [21] that tend to enhance the fraction of the planet's iron core. At present, Kepler-78b is the extrasolar planet whose mass, radius and likely composition are most similar to those of the Earth. However, it differs from the Earth notably in its very short orbital period and correspondingly high temperature.
The observations of Kepler-78 have shown the potential of the much-anticipated HARPS-N spectrograph. It will have a crucial role in the characterization of the many Kepler planet candidates with radii similar to that of the Earth. By acquiring and analysing a large number of precise radial-velocity measurements, we can learn whether Earth-sized planets (typically) have Earth-like densities (and, by inference, Earth-like compositions), or whether even small planets have a wide range of compositions, as has recently been established [22,23] for their larger kin.

METHODS SUMMARY
In the case of Kepler-78, the planet-induced radial-velocity variation is small compared with the stellar jitter. If their periodicities are very different, however, it is easy to de-correlate the signals and determine the radial-velocity amplitude of the planet. We used the Kepler light curve of Kepler-78 to measure the stellar rotational period. After de-trending [24,25] the photometry, we computed its power spectral density, which immediately revealed excess power at a period of 12.6 d, the rotational period of the star.
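The period search described here, and the autocorrelation check described in the Methods, can be illustrated with a short sketch: the power spectral density is computed with a fast Fourier transform, and the autocorrelation function is obtained from its inverse transform. The light curve below is synthetic; the sampling step, noise level, harmonic amplitude and random seed are assumptions made for illustration and are not Kepler data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, evenly sampled light curve (illustrative values only).
dt = 0.02                                  # sampling step in days (hypothetical)
t = np.arange(0.0, 200.0, dt)
p_true = 12.6                              # injected rotation period in days
flux = (np.sin(2.0 * np.pi * t / p_true)
        + 0.3 * np.sin(4.0 * np.pi * t / p_true + 1.0)   # first harmonic, P/2
        + rng.normal(0.0, 0.2, t.size))

def psd_and_acf(x):
    """Power spectral density via FFT, and the ACF as its inverse transform."""
    x = x - x.mean()
    n = x.size
    spec = np.fft.rfft(x, n=2 * n)         # zero-pad to avoid circular wrap-around
    psd = np.abs(spec) ** 2
    acf = np.fft.irfft(psd)[:n]
    return psd, acf / acf[0]

psd, acf = psd_and_acf(flux)
freq = np.fft.rfftfreq(2 * t.size, d=dt)

# Strongest non-zero-frequency peak of the PSD.
period_psd = 1.0 / freq[1:][np.argmax(psd[1:])]

# First ACF maximum after the ACF first turns negative.
first_neg = int(np.argmax(acf < 0.0))
window = int(round(period_psd / dt))
period_acf = (first_neg + int(np.argmax(acf[first_neg:first_neg + window]))) * dt

print(f"PSD period: {period_psd:.1f} d, ACF period: {period_acf:.1f} d")
```

Real Kepler photometry has gaps and evolving spots, so a practical analysis would need gap handling and would see the ACF peaks decay with lag, as described in the Methods.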
The HARPS-N observations of Kepler-78 yielded not only radial velocities but also high-resolution spectra, which we combined into a spectrum with a high signal-to-noise ratio. By applying the stellar parameter classification pipeline [12], we derived precise stellar parameters. In particular, we re-determined the stellar radius to be $R_\star = 0.737^{+0.034}_{-0.042}\,R_\odot$, with smaller uncertainties than in the value in the discovery paper [8].
For the purpose of measuring the signal induced by Kepler-78b, several models can be applied to the data, which may all lead to similar results. However, not all of the models represent the data with the same quality. We therefore used the Bayesian information criterion [26-28] to determine which model matches the data best. This analysis led us to select a three-Keplerian model with two sinusoids (zero-eccentricity Keplerians) for the planet and the 4.2-d stellar signal, respectively, and one Keplerian with non-zero eccentricity for the stellar signal at 10 d.
Once a model has been selected, it is adjusted by a least-squares fit to the data. This approach leads to the maximum-likelihood solution but does not provide all statistically relevant solutions and the distributions of their parameters. We used an MCMC analysis to determine the distribution of all orbital and planetary parameters, in particular the planetary mass and density, and to determine their respective errors.
Online Content Any additional Methods, Extended Data display items and Source Data are available in the online version of the paper; references unique to these sections appear only in the online paper.  Supplementary Information is available in the online version of the paper.

METHODS
Photometric determination of stellar rotational period of Kepler-78. In ref. 8, the Kepler light curve of Kepler-78 was analysed and was de-trended using the PDC-MAP algorithm (Extended Data Fig. 2), which preserves stellar variability [24,25]. The light curve displays clear rotational modulation with a peak-to-valley amplitude that varies between 0.5% and 1.5%, and a period of 12.6 ± 0.3 d. We confirmed the rotational period by computing the autocorrelation function (ACF) of the PDC-MAP light curve: using a fast Fourier transform, we computed the power spectral density, from which the ACF is obtained using the inverse transform. We immediately derive a rotational period of 12.6 d (Extended Data Fig. 3a). The amplitudes of successive peaks decay on an e-folding timescale of about 50 d, which we attribute to the finite lifetimes of individual active regions. The power density distribution in Extended Data Fig. 3b finally shows a peak at the stellar rotational period as well as at its first and second harmonics. The main signal at the period $P = 0.355$ d of Kepler-78b, as well as its harmonics, is easily identified at shorter periods.
HARPS-N observations and stellar parameters. To explore the feasibility of the programme, we performed six hours of continuous observations during a first test night in May 2013. This test night allowed us to determine the optimum strategy and to verify whether the measurement precision was consistent with expectations. Indeed, 12 exposures, each of 30 min, led to an observed dispersion of the order of 2.5 m s⁻¹, close to the expected photon noise. We therefore decided to dedicate six full HARPS-N nights to the observation of Kepler-78b in June 2013. Given the excellent stability of the instrument (typically less than 1 m s⁻¹ during the night) and the faintness of the star, we observed without the simultaneous reference source [10,11] that usually serves to track potential instrumental drifts. Instead, the second fibre of the spectrograph was placed on the sky to record possible background contamination during cloudy moonlit nights. Owing to excellent astroclimatic conditions, we gathered a total of 81 exposures, each of 30 min, free of moonlight contamination and with an average signal-to-noise ratio (SNR) of 45 per extracted pixel at a wavelength of λ = 550 nm. An extracted pixel covers a wavelength bin of 0.00145 nm. A first analysis of these observations confirmed the presence of the planetary signal. However, it also confirmed that, as suggested in ref. 8, the stellar variability induces radial-velocity variations much larger than the planetary signal, although on very different timescales.
To consolidate our results and improve the precision of our planetary-mass measurement, we decided to perform additional observations during the months of July and August 2013. We preferred, however, to observe Kepler-78 only twice per night, around quadrature (at maximum and minimum expected radial velocity), to minimize observing time and to maximize the information on the amplitude. This strategy allowed us to determine the (low-frequency) stellar contribution as the sum of the two nightly measurements and the (high-frequency) planetary signal as the difference between them. We finally obtained a total of 109 high-quality observations over three months, with an average photon-noise-limited precision of 2.3 m s⁻¹.
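The sum-and-difference idea behind the two-per-night strategy can be written out explicitly. Under the circular-orbit model of the main text, a measurement a quarter of a period after mid-transit gives $v_0 - K_p$ and one three-quarters of a period after mid-transit gives $v_0 + K_p$, provided the slowly varying stellar term $v_0$ is the same for both. The sketch below assumes noise-free measurements taken exactly at quadrature; the function name `split_quadrature_pair` and the numbers are hypothetical.

```python
def split_quadrature_pair(v_first_quadrature, v_third_quadrature):
    """Separate a nightly pair into the slow (stellar) term and the planetary amplitude.

    Assumes v = v0 - K sin(2*pi*phase), with phases 0.25 and 0.75 after mid-transit.
    """
    stellar = 0.5 * (v_first_quadrature + v_third_quadrature)   # half the sum
    k_p = 0.5 * (v_third_quadrature - v_first_quadrature)       # half the difference
    return stellar, k_p

# Hypothetical night: stellar offset 5.0 m/s, planetary semi-amplitude 2.0 m/s.
stellar, k_p = split_quadrature_pair(5.0 - 2.0, 5.0 + 2.0)
print(stellar, k_p)
```

With real data each measurement carries photon noise and is not taken exactly at quadrature, so in practice the pairs constrain a fit rather than giving $K_p$ night by night.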
The large number of high-SNR spectra gathered by HARPS-N allowed us to redetermine the stellar parameters using the stellar parameter classification (SPC) pipeline [12]. Each high-resolution spectrum ($R = 115{,}000$) yields an average SNR per resolution element of 91 in the Mg b region. The weighted average of the individual spectroscopic analyses resulted in final stellar parameters of $T_{\rm eff} = 5058 \pm 50$ K, $\log(g) = 4.55 \pm 0.1$, [m/H] = −0.18 ± 0.08 and $v\sin(i) = 2 \pm 1$ km s⁻¹, in agreement, within the uncertainties, with the discovery paper. The value for $v\sin(i)$ is, however, poorly determined by SPC. Therefore, we adopted an internal calibration based on the full-width at half-maximum of the cross-correlation functions to compute the projected rotational velocity, which yielded $v\sin(i) = 2.8 \pm 0.5$ km s⁻¹. We note that, assuming spin-orbit alignment, the rotational velocity and our estimate of the stellar radius yield a rotational period of 13 d. This value is in agreement with the stellar rotational period determined from photometry.
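The 13-d figure follows from $P_{\rm rot} = 2\pi R_\star / v$, taking $\sin i \approx 1$ under the spin-orbit alignment assumption. A short numerical check, using the radius and $v\sin(i)$ quoted in this paper, is sketched below; the solar radius in kilometres and the function name `rotation_period_days` are assumptions introduced here.

```python
import math

R_SUN_KM = 6.957e5   # nominal solar radius in km (not a value from the paper)

def rotation_period_days(radius_rsun, v_kms):
    """Rotation period implied by an equatorial velocity, P = 2*pi*R / v."""
    circumference_km = 2.0 * math.pi * radius_rsun * R_SUN_KM
    return circumference_km / v_kms / 86400.0

p_rot = rotation_period_days(0.737, 2.8)
print(f"P_rot is about {p_rot:.1f} d")
```

If the spin axis were inclined, the true equatorial velocity would exceed $v\sin(i)$ and the period would be shorter, so this value is an upper limit in general.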
The stellar parameters from SPC [12] have been input to the Yonsei-Yale stellar evolutionary models [13] to estimate the mass and radius of the host star. We obtain $M_\star = (0.758 \pm 0.046)\,M_\odot$ for the stellar mass and $R_\star = 0.737^{+0.034}_{-0.042}\,R_\odot$ for the radius, in agreement, within the uncertainties, with the discovery paper. The Ca II HK activity indicator is computed by the online and automatic data-reduction pipeline, which gives an average value of $\log(R'_{\rm HK}) = -4.52$ when using a colour index of $B - V = 0.91$ for Kepler-78. The stellar parameters are summarized in Extended Data Table 1.
Radial-velocity model selection. It is interesting to note that the signature of Kepler-78b can be retrieved despite the large stellar signals superimposed on the radial velocity induced by the planet. To demonstrate this, we adjusted the data with a simple model consisting of a cosine and the star's systemic velocity, while fixing the period and time of transit to the published values [8]. We compared the results of this simple model with a simple constant using the Bayesian information criterion [26-28] (BIC). We derived the relative likelihood of the two models, also called the evidence ratio, to be $e^{-(1/2)\Delta\mathrm{BIC}_i} = 4 \times 10^{-10}$. This first estimate tells us that our cosine model is much superior to the simple constant. In other words, we can say that we have a clear detection of a signal of semi-amplitude $K_p = 1.88 \pm 0.47$ m s⁻¹.
Although certainly biased owing to the lack of stellar activity de-trending, the result provides a confirmation of the existence of Kepler-78b and a first estimation of its mass.
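The comparison between a constant and a sinusoid at fixed period and phase can be sketched as follows. For Gaussian errors with known variances, the BIC equals $\chi^2 + k \ln N$ up to a constant that cancels in differences, where $k$ is the number of free parameters and $N$ the number of observations. The data below are synthetic: the injected amplitude of 3 m s⁻¹, the systemic velocity, the random sampling and the seed are arbitrary choices for illustration, and no stellar activity is included.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic radial velocities (assumed values; white noise only).
period, t0, sigma_rv = 0.355, 0.0, 2.3
t = rng.uniform(0.0, 97.0, 109)
sigma = np.full(t.size, sigma_rv)
rv = (1.0 - 3.0 * np.sin(2.0 * np.pi * (t - t0) / period)
      + rng.normal(0.0, sigma_rv, t.size))

def chi2_of_linear_fit(A, y, sigma):
    """Chi-square of the best weighted least-squares fit of y by the columns of A."""
    w = 1.0 / sigma
    coef, *_ = np.linalg.lstsq(A * w[:, None], y * w, rcond=None)
    resid = (y - A @ coef) / sigma
    return float(resid @ resid)

def bic(chi2, n_params, n_obs):
    """BIC for Gaussian errors with known variances, up to an additive constant."""
    return chi2 + n_params * np.log(n_obs)

constant = np.ones((t.size, 1))
sinusoid = np.column_stack([np.ones_like(t),
                            -np.sin(2.0 * np.pi * (t - t0) / period)])

bic_constant = bic(chi2_of_linear_fit(constant, rv, sigma), 1, t.size)
bic_sinusoid = bic(chi2_of_linear_fit(sinusoid, rv, sigma), 2, t.size)

delta_bic = bic_constant - bic_sinusoid
evidence_ratio = np.exp(-0.5 * delta_bic)   # relative likelihood of the constant model
print(f"Delta BIC = {delta_bic:.1f}, evidence ratio = {evidence_ratio:.1e}")
```

A very small evidence ratio for the constant model corresponds to strong support for the sinusoid, which is the sense in which the value quoted above is read.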
To model the stellar signature, we followed two different approaches. The first one consists of removing any stellar effect occurring on a timescale longer than 2 d by adjusting nightly offsets to the data. This method has the main advantage of not relying on any analytical model and it overcomes the difficulty of modelling non-stationary processes that often characterize stellar activity. The approach is also well suited to our problem because the period of the planet is very short. Its only drawback comes from the large number of additional parameters (21 offsets, one per night), which is a direct consequence of our observing strategy. The second approach consists of modelling the stellar activity as a set of sinusoids or Keplerian functions. This approach makes sense provided that spot groups and plages are coherent on a timescale similar to the radial-velocity observation time span. For Kepler-78, the ACF of the light curve shows a 1/e de-correlation of ~50 d (Extended Data Fig. 3a), which compares well with the 97-d time span of the HARPS-N observations.
In total, we studied a series of more than 30 different models of different complexity. We have compared these models using the BIC [28] evidence ratio, ER, and the BIC weight, w, to find the best few models: $\mathrm{ER}_i = e^{-(1/2)\Delta\mathrm{BIC}_i}$. Of all the models we considered, two are statistically much more significant. In Extended Data Table 3, we present the distribution of the parameters of the best model that results from our MCMC analysis (see below). Furthermore, Extended Data Fig. 4 shows the periodogram of the radial-velocity residuals after subtracting the stellar components. The planetary signal is now detected with a false-alarm probability significantly lower than 1%.
MCMC analysis. To retrieve the marginal distribution of the true mass of the planet and its density, we carry out an MCMC analysis based on the model selection process described in the previous section. We sample the posterior distributions using an MCMC with the Metropolis-Hastings algorithm. Because the model is very well constrained by the data, the MCMC starts from the solution corresponding to the maximum likelihood, and the MCMC parameter steps correspond to the standard deviation of the adjusted parameters. An acceptance rate of 25% is chosen. To obtain the best possible end result, we take as priors the transit parameters of Kepler-78b (ref. 8). Symmetric distributions are considered to be Gaussian, whereas asymmetric ones, such as that of the orbital inclination, are modelled by split-normal distributions using the published value of the mode of the distribution. We re-derive the radius of the planet using our improved stellar radius estimation and the planet-to-star radius ratio from the Kepler photometry [8]. All other parameters have uniform priors except for the period $P$, for which a modified Jeffreys prior is preferred [29]. We use $\sqrt{e}\cos(\omega)$ and $\sqrt{e}\sin(\omega)$ as free parameters, which translate into a uniform prior in eccentricity [30].
The mean longitude, $\lambda_0$, computed at the mean date of the observing campaign, is also preferred as a free parameter. It has the advantage of not being degenerate for low eccentricities, whereas our choice for the reference epoch, $T_0$, reduces correlations between adjusted parameters. In this analysis, the MCMC has 2,000,000 iterations and converges after a few hundred iterations. The ACF of each parameter is computed to estimate the typical correlation length of our chains and to estimate a sampling interval to build the final statistical sample. All ACFs have a very short decay (1/e decay after 100 iterations and 1/100 decay after 300 iterations) and present no correlations on a larger iteration lag. We build our final sample using the 1/e-decay iteration lag, which is a good compromise between the size of the statistical sample and its de-correlation value. The final statistical samples consist of 20,000 elements, from which orbital elements and confidence intervals are derived. The resulting orbital elements for Kepler-78b are listed in Extended Data Table 3. The results for the mass, radius and density of the planet are given in Extended Data Table 4, and the distributions for the mass and the density are plotted in Extended Data Fig. 5. These distributions are smoothed for better rendering.
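The sampling and thinning procedure can be illustrated with a deliberately small example: a random-walk Metropolis-Hastings sampler that starts at the maximum-likelihood solution, uses the least-squares standard deviations as step sizes, and is then thinned at the lag where the chain ACF first falls below 1/e. This is a sketch under strong simplifications and not the analysis used for the paper: the model has only two parameters (a semi-amplitude and a systemic velocity) with uniform priors, the data are synthetic, the chain has 20,000 iterations rather than 2,000,000, and the step sizes are not tuned to a target acceptance rate.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data (assumed values; white noise, no stellar activity).
period, t0, sigma_rv = 0.355, 0.0, 2.3
k_true, gamma_true = 2.0, 0.5
t = rng.uniform(0.0, 97.0, 109)
basis = -np.sin(2.0 * np.pi * (t - t0) / period)
rv = gamma_true + k_true * basis + rng.normal(0.0, sigma_rv, t.size)

def log_posterior(theta):
    """Gaussian log-likelihood with uniform priors on (K, gamma)."""
    k, gamma = theta
    return -0.5 * np.sum(((rv - gamma - k * basis) / sigma_rv) ** 2)

# Maximum-likelihood start and step sizes from linear least squares.
A = np.column_stack([basis, np.ones_like(t)])
theta_ml, *_ = np.linalg.lstsq(A, rv, rcond=None)
step = sigma_rv * np.sqrt(np.diag(np.linalg.inv(A.T @ A)))

n_iter = 20000
chain = np.empty((n_iter, 2))
theta, lp = theta_ml.copy(), log_posterior(theta_ml)
accepted = 0
for i in range(n_iter):
    proposal = theta + step * rng.normal(size=2)
    lp_prop = log_posterior(proposal)
    if np.log(rng.uniform()) < lp_prop - lp:    # Metropolis acceptance rule
        theta, lp = proposal, lp_prop
        accepted += 1
    chain[i] = theta

def acf_fft(x):
    """Normalised autocorrelation function via the FFT (zero-padded)."""
    x = x - x.mean()
    spec = np.fft.rfft(x, n=2 * x.size)
    acf = np.fft.irfft(np.abs(spec) ** 2)[:x.size]
    return acf / acf[0]

acf_k = acf_fft(chain[:, 0])
lag = int(np.argmax(acf_k < 1.0 / np.e))        # first lag below 1/e
sample = chain[::lag]                           # thinned, weakly correlated sample
k_median = float(np.median(sample[:, 0]))
acceptance = accepted / n_iter
print(f"lag = {lag}, kept {sample.shape[0]} samples, K median = {k_median:.2f} m/s")
```

Thinning at the 1/e lag trades sample size against residual correlation, the same compromise described above for the full chains.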