Abstract

This paper describes a Field Programmable Gate Array (FPGA) based 2x2 MIMO testbed which employs a new semi-blind method for channel estimation. The RF part of this system is built using commercially available components and in-house developed antennas while commercially available FPGA hardware is used for fast processing of Intermediate Frequency (IF) signals. The operation of this MIMO testbed is verified at the IF level for an Alamouti transmission scheme using a purpose developed channel emulator.

1. INTRODUCTION

Multiple Input Multiple Output (MIMO) wireless communications is an emerging technology that uses multiple element antennas (MEAs) on both the transmit and receive sides of the communication link to improve data throughput over the traditional single-transmit, single-receive antenna wireless system (also known as SISO) in a non-line-of-sight environment [1], [2]. Investigating the properties of actual MIMO channels and determining optimal transmission schemes under varying channel conditions has been the subject of research in many parts of the world. In order to test the performance of various MIMO transmission schemes in real environments, a MIMO testbed is usually employed. MIMO testbeds, such as those described in [3-7], aim to measure variables such as the elements of the complex channel matrix, in addition to traditional communication system parameters like bit error rate (BER). The major challenge in the design and development of such systems is handling the increased volume of data received on multiple channels, which are formed by the multiple element antennas and the scattering environment. A special type of signal processor is required to tackle this task efficiently.

One of the key factors for the proper operation of a MIMO system is accurate and fast channel estimation. In one possible approach, training sequences known to both transmitter and receiver accompany each information-carrying data packet and are used to estimate the channel. The disadvantage of this approach is the reduction of the information throughput. In an alternative approach, blind estimation is used, which does not involve known data sequences. Instead, some properties of the information signal are utilized to estimate the channel. Although this approach aims at avoiding a reduction of capacity, in practice it requires long sequences of data and a time-consuming iterative procedure. As a result, this approach may lead to estimation errors and thus to a reduction of the system capacity.

In this paper, an FPGA based 2x2 MIMO testbed that employs an improved channel estimation method, named the semi-blind estimation method, is described. The operation of this MIMO testbed is investigated by assuming the Alamouti scheme for space time coding/decoding at the transmitter and receiver. This coding/decoding scheme is entirely implemented in FPGA hardware. In order to test various stages of this system, a purpose-developed channel emulator is used. Using the newly proposed method, we re-estimate the channel after each symbol is received. This allows the system to maintain an accurate estimate over long packet periods. Assuming that the channel matrix changes gradually, we demonstrate that the new channel estimation method is advantageous in comparison with the traditional training sequence and blind channel estimation methods.

2. SYSTEM DESIGN

The investigated MIMO testbed is a 2x2 MIMO system that features two-element array antennas, RF hardware and IF signal processing modules. Its schematic, including details of the various modules, is shown in Fig. 1.

The RF hardware is developed using commercially available amplifiers, oscillators, and RF and IF filters. The operational frequency band is chosen as the 2.45GHz ISM frequency band. The RF front end is formed by components from Mini Circuits. Signal processing is accomplished at the IF using a Field Programmable Gate Array (FPGA). The FPGA selected in this design is a Stratix II FPGA from Altera. Accompanying the FPGA are two high-speed Analogue to Digital converters (ADC), capable of 125MSamples/sec, and two Digital to Analogue converters (DAC), capable of 165MSamples/sec (14bits). In addition to the above modules, a high speed interface for real time data retrieval and display is available via a 100Mbit Ethernet port. Using this approach, it is possible to demonstrate real-time modulation and demodulation of a Space Time encoded QPSK signal.

For space time coding/decoding at the transmitter and receiver, the system employs the 2x2 Alamouti scheme. In order to perform the various stages of signal processing, the following modules are constructed in the FPGA: 1 transmit module (Space Time Modulator), 1 receive module (Space Time Demodulator including the channel matrix $H$ estimator), 1 channel emulator and control circuitry. These modules are illustrated in Fig. 2 and are described in the following.

A. FPGA Transmitter Module

The FPGA transmitter module consists of an IQ mapper and a numerically controlled oscillator (NCO). The IQ mapper encodes the input bit stream into a space-time code and then creates in-phase (I) and quadrature (Q) symbols. The NCO, via a look up table of sine values, is able to generate frequencies between 97kHz and 50MHz using the on-board 100MHz crystal clock. Here, it is applied to generate a 6.25MHz waveform, with the phase offset defined by the IQ mapper. The TX module has 4 modes of operation: both channels transmit, a single channel transmits (channel 1 or channel 2), or no transmission. In normal operation three modes are used: channel 1 only, both channels, and no transmission. This is done in order to allow the training sequence to be received. The inputs to the complete TX module include the bit stream and a mode selector.
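A lookup-table NCO of this kind can be sketched in a few lines of Python. The sketch below is illustrative only and is not the testbed's HDL. The table length of 1024 entries is an assumption, chosen because 100MHz / 1024 is about 97.7kHz, which is close to the quoted lower frequency limit. The function name `nco_samples` and its parameters are hypothetical.

```python
import math

CLOCK_HZ = 100_000_000   # on-board crystal clock stated in the text
TABLE_SIZE = 1024        # ASSUMED table length (100 MHz / 1024 is about 97.7 kHz)

# One full sine cycle stored as a lookup table.
SINE_TABLE = [math.sin(2 * math.pi * k / TABLE_SIZE) for k in range(TABLE_SIZE)]


def nco_samples(freq_hz, phase_offset_deg, num_samples):
    """Generate samples by stepping a phase accumulator through the sine table."""
    step = round(freq_hz * TABLE_SIZE / CLOCK_HZ)           # table entries per clock tick
    offset = round(phase_offset_deg * TABLE_SIZE / 360.0)   # phase chosen by the IQ mapper
    return [SINE_TABLE[(offset + n * step) % TABLE_SIZE] for n in range(num_samples)]


# A 6.25 MHz carrier advances 64 table entries per tick, i.e. 16 samples per cycle.
symbol = nco_samples(6_250_000, 45.0, 32)
```

With these numbers a 32-sample symbol holds exactly two carrier cycles, which agrees with the symbol period quoted for the channel emulator.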

B. FPGA Receiver Module Employing Semi-Blind Channel Estimation

The FPGA receiver module consists of 4 main modules: a mixer module, a symbol detector, a symbol de-mapper, and an ML estimator and decoder. The mixer module mixes the received signal with local cosine and sine signals produced by the NCO. This is followed by symbol detection, which uses the I and Q waveforms obtained from the mixer. An integration process is used to obtain the I and Q signal values. In digital logic this is very simple, as it only involves accumulation over a symbol period. Typically, I and Q channel signals are obtained via a low pass filter and sampling. By contrast, our accumulation-based approach does not require any multipliers.
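The mix-and-accumulate detection described above can be illustrated with a short floating-point sketch. It assumes a 32-sample symbol containing two carrier cycles (16 samples per cycle), as quoted elsewhere in the paper. The function name `detect_iq` and the final scaling are illustrative choices, not details of the FPGA design.

```python
import math

SAMPLES_PER_CYCLE = 16   # 6.25 MHz carrier sampled at 100 MHz


def detect_iq(samples):
    """Mix one symbol with local cosine and sine, then accumulate over the symbol."""
    i_acc = 0.0
    q_acc = 0.0
    for n, s in enumerate(samples):
        angle = 2 * math.pi * n / SAMPLES_PER_CYCLE
        i_acc += s * math.cos(angle)   # in-phase branch
        q_acc -= s * math.sin(angle)   # quadrature branch
    scale = 2.0 / len(samples)         # normalise so a unit-amplitude carrier gives |I + jQ| = 1
    return i_acc * scale, q_acc * scale


# One QPSK symbol with a 45 degree phase offset.
rx_symbol = [math.cos(2 * math.pi * n / SAMPLES_PER_CYCLE + math.pi / 4) for n in range(32)]
i_value, q_value = detect_iq(rx_symbol)
```

Because the accumulation runs over a whole number of carrier cycles, the double-frequency mixing products cancel in the sum, so no separate low pass filter is needed in this sketch.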

The proposed semi-blind channel estimation procedure employed by the receiver is as follows. The symbol de-mapper takes the I and Q values and, with the assistance of the ML decoder and estimator, recovers the transmitted bit sequence. This is done by first assuming an arbitrary channel matrix $H$ and decoding the signal with it. A threshold operation is then applied to the decoded signal. The thresholded symbols are then used to estimate the new channel matrix. In order to minimize the effects of noise, the channel matrix used for the next decoded signal is the average of the previous and the newly estimated matrix.
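The loop described above (decode, threshold, re-estimate, average) can be sketched as follows. This is a noise-free floating-point illustration under stated assumptions, not the authors' FPGA logic. It assumes QPSK symbols, the standard Alamouti block, a made-up slowly rotating channel, and six known training blocks before decision-directed operation starts. All function names and numeric values are hypothetical.

```python
import cmath
import math

QPSK = [cmath.exp(1j * math.pi * (2 * k + 1) / 4) for k in range(4)]


def alamouti_block(s1, s2):
    """Rows are transmit antennas, columns are the two symbol periods."""
    return [[s1, -s2.conjugate()], [s2, s1.conjugate()]]


def apply_channel(h, x):
    """Noise-free y = H x for 2x2 matrices stored as lists of rows."""
    return [[sum(h[i][k] * x[k][t] for k in range(2)) for t in range(2)] for i in range(2)]


def decode(h, y):
    """Alamouti combining with the current channel estimate."""
    s1 = sum(h[i][0].conjugate() * y[i][0] + h[i][1] * y[i][1].conjugate() for i in range(2))
    s2 = sum(h[i][1].conjugate() * y[i][0] - h[i][0] * y[i][1].conjugate() for i in range(2))
    return s1, s2


def threshold(s):
    """Hard decision to the nearest QPSK point."""
    return complex(math.copysign(1, s.real), math.copysign(1, s.imag)) / math.sqrt(2)


def estimate(y, x):
    """H = Y X^H / (|s1|^2 + |s2|^2); valid because X X^H is a scaled identity."""
    norm = abs(x[0][0]) ** 2 + abs(x[1][0]) ** 2
    return [[sum(y[i][t] * x[j][t].conjugate() for t in range(2)) / norm
             for j in range(2)] for i in range(2)]


def run(num_blocks=40, training_blocks=6, drift=0.02):
    h0 = [[0.9 + 0.2j, 0.3 - 0.4j], [-0.2 + 0.5j, 0.8 - 0.1j]]  # made-up channel
    h_est = [[1 + 0j, 0j], [0j, 1 + 0j]]                         # arbitrary starting estimate
    symbol_errors = 0
    for n in range(num_blocks):
        # The true channel rotates slowly from block to block.
        h_true = [[v * cmath.exp(1j * drift * n) for v in row] for row in h0]
        s1, s2 = QPSK[n % 4], QPSK[(3 * n + 1) % 4]
        y = apply_channel(h_true, alamouti_block(s1, s2))
        if n < training_blocks:
            d1, d2 = s1, s2   # known training symbols assist the threshold
        else:
            d1, d2 = (threshold(s) for s in decode(h_est, y))
            symbol_errors += (abs(d1 - s1) > 1e-9) + (abs(d2 - s2) > 1e-9)
        h_new = estimate(y, alamouti_block(d1, d2))
        # Average the previous and new estimates to smooth the update.
        h_est = [[(h_est[i][j] + h_new[i][j]) / 2 for j in range(2)] for i in range(2)]
    error = max(abs(h_est[i][j] - h_true[i][j]) for i in range(2) for j in range(2))
    return symbol_errors, error


symbol_errors, estimate_error = run()
```

In this sketch the estimate lags the true channel by roughly one block of drift, which is the price of the averaging step. A real system would also have to cope with noise and with wrong decisions feeding back into the estimate.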

One of the important tasks in this module is synchronization. This needs to be accomplished on the symbol level and the data level. On the symbol level symbol change boundaries need to be monitored, and on the data level, the receiver needs to detect the start and end of the training sequence and data transmission.

The training sequence in this design involves first transmitting on only one channel using BPSK modulation. This allows symbol transitions to be easily detected and the $h_{11}$ and $h_{12}$ elements of the complex channel matrix to be estimated. During training, the known training symbols are used when the threshold operation is performed on the decoded signals.

Channel estimation of the $H$ matrix uses the Maximum Likelihood (ML) estimation method. It is assumed that during training sequence transmission the channel changes slowly, and hence two pairs of ML estimates (over 4 symbol periods) are calculated. The offsets between consecutive symbol changes are used to predict the start and end of training sequence blocks. The inputs to the ML estimation block are the accumulated I and Q channels and the known symbols. Through a process of 8 complex multiplications (32 real multiplications) each element of the $H$ matrix can be calculated. The output of this block is stored in the $H$ matrix, which is then used as one of the inputs for ML decoding. Signal decoding of the Alamouti based STC signal is performed using the I and Q signal values and the previously estimated $H$ matrix. Two consecutive I and Q values are stored for each channel and, through a process of 8 complex multiplications (32 real multiplications), the original 2 symbols are determined. These multiplications are done using 4 multiplier blocks and hence are spread over 8 clock cycles in order to save and reuse resources.
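The text does not spell out the formulas behind these multiplication counts. One reading that reproduces them, assuming the standard Alamouti block $X$, is given here. The symbols $s_1$, $s_2$ and the matrix names $Y$ and $X$ are notation introduced for this illustration and are not taken from the testbed.

$$X = \begin{bmatrix} s_1 & -s_2^* \\ s_2 & s_1^* \end{bmatrix}, \qquad X X^H = \left(|s_1|^2 + |s_2|^2\right) I$$

$$\hat{H} = \frac{Y X^H}{|s_1|^2 + |s_2|^2}$$

$$\hat{s}_1 = \sum_{i=1}^{2} \left( \hat{h}_{i1}^* y_i(t) + \hat{h}_{i2} y_i^*(t+1) \right), \qquad \hat{s}_2 = \sum_{i=1}^{2} \left( \hat{h}_{i2}^* y_i(t) - \hat{h}_{i1} y_i^*(t+1) \right)$$

Under this reading, the product $Y X^H$ of two 2x2 matrices needs $2 \times 2 \times 2 = 8$ complex multiplications, and each decoded symbol needs 4, giving 8 for the pair. The decoded values $\hat{s}_1$ and $\hat{s}_2$ are scaled by the real positive factor $\sum_{i,j} |\hat{h}_{ij}|^2$, which does not affect a QPSK threshold decision. With 4 real multiplications per complex product, either step costs 32 real multiplications, which 4 multiplier blocks can complete in 8 clock cycles.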

The ML estimation and decoding blocks are used together to train the receiving system. When the training sequence is finished, ML estimation is used as a channel matrix $H$ corrector. During transmission, the two symbols outputted from the ML decoders are unmapped from IQ back to bits and reassembled into the bitstream.

C. Real-Time Data Acquisition and Control Circuitry

In order to control the various logic functions, and to allow interactive processing of results, a softcore (produced from logic gates) processor (Nios II) is used. This processor is configured to run at the same clock rate as the other hardware modules (100MHz), and the μCLinux operating system is used. μCLinux is selected due to its advanced networking functions, and flexibility. The processor acts as a gateway between the hardware modules and a PC, via Ethernet and a web based (HTTP) interface. On the PC a web browser with support for the Scalable Vector Graphics (SVG) format is used to interface with the document provided by the embedded web server on the processor.

The Nios II processor interacts with the hardware modules via special buffers. Some of these buffers are used to control the hardware modules, for example by holding the emulated channel matrix $H$ and the bits to be transmitted. The other buffers are used to analyse data as it is processed by various parts of the transmitter and receiver. These data analysis buffers are configured to store 1024 samples of data. Included in these data buffers are the estimated channel matrix $H$, the received bitstream, and some of the internal signals, such as the received signals, the I and Q channels and the transmitted signal. Information such as the Bit Error Rate (BER) and the error in channel estimation is calculated and stored in buffers. The BER is calculated from the difference between the transmitted and received bitstreams, while the channel matrix error is determined by comparing the estimated channel matrix $H$ against the emulated channel matrix $H$. This approach makes it possible for the software to sequentially fetch synchronized data from various parts of the transmitter and receiver modules.

D. Channel Emulator

In order to test the operation of the proposed MIMO testbed, which relies on a semi-blind channel estimation procedure, a wireless channel emulator is constructed.

For the 2x2 MIMO system, the channel matrix $H$ representation is used as given in (1).

$$y = Hx + n;$$

$$\begin{bmatrix}
y_1(t) & y_1(t+1) \\
y_2(t) & y_2(t+1)
\end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} \\
h_{21} & h_{22}
\end{bmatrix} \begin{bmatrix} x_1(t) & x_1(t+1) \\
x_2(t) & x_2(t+1)
\end{bmatrix} + \begin{bmatrix} n_1(t) & n_1(t+1) \\
n_2(t) & n_2(t+1)
\end{bmatrix}$$

(1)

where $y$ is the received signal vector, $x$ is the transmitted signal vector, and $n$ is the noise vector. Both $y$ and $x$ span two symbol periods for the same channel matrix $H$ due to the application of the Alamouti scheme. Both the real and imaginary parts of the signal need to be known. In order to synthesize the imaginary component, a 90 degree phase delay is used. Since only the real component of the signal is required after the channel emulator, the equations used for this block are as shown in (2).

$$y_i(t) = x_i(t)\text{real}(h_{ii}) + x_i(t-d)\text{imag}(h_{ii}) + x_j(t)\text{real}(h_{ij}) + x_j(t-d)\text{imag}(h_{ij})$$

(2)

where $j \neq i$ denotes the other transmit channel and $d$ is the delay that corresponds to a 90 degree phase shift of the carrier. In our case, the symbol period is 32 samples and contains two cycles of the sine wave, so one cycle spans 16 samples. Hence $d$ is equal to 4 sample periods.
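The delay-based emulation can be illustrated with a short sketch. It sums, for each receive channel, the direct term and the cross term from the other transmit channel, which is the form consistent with (1). The function name `emulate`, the zero fill for the first four samples and the test channel are illustrative assumptions, and noise is omitted.

```python
import math

D = 4  # quarter-cycle delay in samples (16 samples per carrier cycle)


def emulate(x1, x2, h):
    """Apply a 2x2 complex channel to two real IF sample streams."""
    streams = (x1, x2)
    outputs = []
    for i in range(2):
        y = []
        for t in range(len(x1)):
            acc = 0.0
            for j in range(2):
                # Samples before the start of the stream are taken as zero.
                delayed = streams[j][t - D] if t >= D else 0.0
                acc += streams[j][t] * h[i][j].real + delayed * h[i][j].imag
            y.append(acc)
        outputs.append(y)
    return outputs


# A purely imaginary h11 turns the cosine on TX1 into a sine on RX1.
tx1 = [math.cos(2 * math.pi * n / 16) for n in range(32)]
tx2 = [0.0] * 32
rx1, rx2 = emulate(tx1, tx2, [[1j, 0j], [0j, 1 + 0j]])
```

The sketch uses only real multiplications and a four-sample delay line per transmit stream, which is why the approach maps well to simple digital logic.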

To emulate the channel matrix $H$ variations, two alternative approaches are considered. In the first approach, $H$ is realized by using measured results. In the alternative approach, the signal scattering model described in [8] is applied to generate $H$. For the MIMO measurements, a vector network analyser (VNA) is used to conduct wireless channel measurements in an indoor environment. These are done on the fifth floor of the Hawken Building at the University of Queensland. The receiver is located in a hall while the transmitter is situated in a laboratory room accommodating the VNA. The VNA is calibrated for $S_{21}$ scattering parameter measurements. In order to boost the dynamic range of the VNA, a low noise amplifier (LNA) is used at the transmitter side. This approach enables perfect synchronization between transmitted and received signals and allows the elements of the channel matrix to be measured directly. The experimental configuration is shown in Fig. 3.

---

**International Symposium on Antennas and Propagation — ISAP 2006**

The transmitter and receiver are separated by a soft wall that includes a metal sheet, and thus the chosen configuration represents pure NLOS conditions. The data is collected at the center frequency of 2.45GHz over a 200MHz bandwidth. Quarter-wavelength wire monopole antennas designed for 2.45GHz with vertical polarization are used in the 2x2 MIMO measurements. The antenna element spacing is set to around 0.5λ. As both antenna elements of each pair are present during measurements, the measured data includes the effect of mutual coupling. The individual channel matrix elements $h_{11}$, $h_{12}$, $h_{21}$ and $h_{22}$ are obtained using the VNA by manually switching the transmit/receive antennas of the 2x2 MIMO system. In order to generate statistical data, 8 receiver locations are used and at each location 10 sets of measurements are taken. Next, all 80 channel matrices are saved in terms of real and imaginary parts and transferred from the VNA to the buffer of the FPGA.

For the scattering model, the scattering environment is represented by a rectangular region of dimensions 200λ x 200λ, with the transmitter and receiver, each equipped with an MEA, located on opposite sides of the rectangle. 600 scatterers uniformly distributed within the rectangular region are assumed. The validity of the scattering model has already been confirmed by data obtained from real channel measurements [8].
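The details of the model in [8] are not reproduced here. As a rough illustration only, the sketch below generates one channel matrix from a generic single-bounce geometry with the stated region size and scatterer count. Unit path gains, the antenna placement at the midpoints of opposite sides, the 0.5λ spacing, the normalisation and the function name `scattering_channel` are all assumptions and may differ from [8].

```python
import cmath
import math
import random


def scattering_channel(num_scatterers=600, side=200.0, spacing=0.5, seed=0):
    """Single-bounce 2x2 channel matrix; all lengths are in wavelengths."""
    rng = random.Random(seed)
    # Two-element arrays centred on opposite sides of the square region.
    tx = [(0.0, side / 2 + (k - 0.5) * spacing) for k in range(2)]
    rx = [(side, side / 2 + (k - 0.5) * spacing) for k in range(2)]
    scatterers = [(rng.uniform(0, side), rng.uniform(0, side))
                  for _ in range(num_scatterers)]
    h = [[0j, 0j], [0j, 0j]]
    for s in scatterers:
        for i, r in enumerate(rx):
            for j, t in enumerate(tx):
                path = math.dist(t, s) + math.dist(s, r)   # TX -> scatterer -> RX
                h[i][j] += cmath.exp(-2j * math.pi * path)  # phase from path length
    scale = math.sqrt(num_scatterers)                       # keep entries near unit power
    return [[v / scale for v in row] for row in h]


h_model = scattering_channel()
```

Calling the function with different seeds gives different scatterer layouts, which is one simple way to obtain a set of channel matrices for emulation.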

The measured or simulated data representing the channel matrix variations is stored in the buffer of the FPGA. For emulation purposes, 1000 different channel matrices represented in terms of real and imaginary parts are used. The noise vector $n$ is generated using a uniformly distributed random number generator that selects a random entry from a table of pre-generated Gaussian distributed random numbers. Next, the Alamouti scheme for the 2x2 MIMO system with the maximum likelihood (ML) channel estimation technique is implemented.
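The table-based noise source can be sketched as follows. The table length of 1024, the seeds and the name `noise_samples` are assumptions made for illustration, since the paper does not state them.

```python
import random

TABLE_SIZE = 1024  # ASSUMED table length; not given in the paper

# Gaussian values generated once, standing in for the table stored in the FPGA.
_table_rng = random.Random(1)
GAUSS_TABLE = [_table_rng.gauss(0.0, 1.0) for _ in range(TABLE_SIZE)]


def noise_samples(count, sigma, index_rng):
    """Draw noise by picking uniformly distributed indices into the Gaussian table."""
    return [sigma * GAUSS_TABLE[index_rng.randrange(TABLE_SIZE)] for _ in range(count)]


noise = noise_samples(100, 0.5, random.Random(7))
```

The appeal of this scheme in hardware is that only a uniform index generator runs at sample rate, while the Gaussian shaping is paid for once when the table is filled.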

3. VALIDATION

The MIMO testbed is validated with respect to IF signal processing. This is done by removing the RF front ends shown in Fig. 1 and forming a wire connection between the TX and RX modules. The channel emulator is inserted after the TX module and before the wire. In the test, the data rate is assumed to be 3.125 Mbit per second. The test results are shown in Fig. 4-8. In these figures the signal magnitudes are normalized using the fixed point number representation implemented in the FPGA. An automatic gain control module amplifies the signal when it is considered too small. When the signal passes through the D/A converter, the wire and the A/D converter, a signal loss of around 3.5dB is observed.

In the proposed method, the channel matrix is continually corrected, both in training sequence and in normal data transmission. One important difference is in the training sequence, in that the threshold function is assisted by the known symbols in the training sequence.

The training sequence is also split into multiple sections. The first part uses a signal transmitted from only TX1 (and received on both RX1 and RX2). During this part of the training sequence the signal is only distorted by phase and a decrease in magnitude, and hence initially the problem of symbol synchronization can be solved like in a conventional SISO system. Due to synchronization on the TX1 to RX1 signal (represented by $h_{11}$ in the channel matrix), all terms of the channel matrix $H$ calculated at the receiver site are relative to $h_{11}$. During the next part of the training sequence both transmit channels are used in order to properly estimate the last two elements of the channel matrix $H$.

After proper synchronization obtained with the training sequence, the performance of space-time coding is evaluated under different channel matrix realizations. The received signals under varying channel matrix data are shown in Fig. 4. Another representation is shown in Fig. 5 and 6. The first of these is for an ideal channel (channel matrix $H$ is an identity matrix), and the other one concerns the case when $H$ is generated by the scattering model of [8]. The first step involves decomposing the received signal into I and Q pairs. This is done by integrating over a symbol period, and storing the result. Results shown in Fig. 4-6 indicate that the MIMO system is successfully implemented in FPGA hardware. The input and output bit streams are the same and all signals of I and Q channels are correctly identified.

The system performance depends on its ability to properly estimate the channel matrix $H$. This is demonstrated by the results presented in Fig. 7, where the effect of noise on BER performance is studied under the condition of perfect synchronization. Perfect signal synchronization is established through the training sequence and manual intervention. 10,000 random bits are sent to the channel emulator and the bit error rate (BER) at the receiving module is recorded. When noise of a given SNR is applied, each symbol spans two periods of a sine wave. Two cases of channel estimation are considered: the first is perfect estimation, and the second uses ML estimation. The results for perfect channel estimation match the known result for a 2x2 STC MIMO system [9].
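The BER bookkeeping for such a run reduces to counting mismatched bit positions. A minimal sketch is shown below. The flipped bit positions stand in for decoding errors and are not measured data, and the function name `bit_error_rate` is hypothetical.

```python
import random


def bit_error_rate(sent, received):
    """Fraction of positions where the received bit differs from the sent bit."""
    if len(sent) != len(received):
        raise ValueError("bitstreams must have equal length")
    errors = sum(a != b for a, b in zip(sent, received))
    return errors / len(sent)


rng = random.Random(0)
tx_bits = [rng.randrange(2) for _ in range(10_000)]
rx_bits = list(tx_bits)
for k in (10, 2500, 9999):   # flip three bits to mimic decoding errors
    rx_bits[k] ^= 1
ber = bit_error_rate(tx_bits, rx_bits)
```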

The web server interface provides a real-time method of data visualization in terms of bit stream, symbols and raw signal. On average, 1024 data points can be plotted every 40ms. This SVG interface was used to display the data shown in Fig. 4-6.

Since the system is implemented in FPGA hardware, it operates in real time. There is, however, a latency of 24 clock cycles (240ns) between the time an STC block is received and the time the bits are ready, due to the way the logic is structured. In terms of resource usage, the system takes up 5000 ALUTs (adaptive lookup tables) [10%], 16 DSP blocks [5%] and 22,740 memory bits [1%] of an Altera 2S60 DSP. This system uses slightly fewer memory bits than a system that uses only a training sequence to estimate and train the estimated channel.

4. CONCLUSION

In this paper the design and development of a Field Programmable Gate Array (FPGA) based 2x2 MIMO testbed which employs a new semi-blind method for channel estimation has been presented. At the present stage, the 2x2 MIMO system for space time coding Alamouti scheme has been implemented. Its operation has been investigated using a channel emulator. The obtained results indicate that the developed testbed works properly with respect to processing of the IF signals and is ready for performing various tests in actual indoor wireless environments.

ACKNOWLEDGEMENT

The authors acknowledge the financial support of the Australian Research Council via Discovery Project Grant DP0450118.

REFERENCES


