Faculty of technology and science
Analysis of Fix‐point Aspects for Wireless Infrastructure Systems
Robin Englund Andreas Grill
Degree Project in Electrical Engineering
Spring 2009
Mentors:
Bengt Hallinger, Tieto
Arild Moldsvor, Karlstad University
This 22.5 ECTS points Degree Project is part of the 3 year, 180 ECTS points Electrical Engineering programme at Karlstad University, Sweden.
This thesis is submitted in partial fulfillment of the requirements for
the degree in Electrical Engineering. All material in this thesis which is
not my own work has been identified and no material is included for
which a degree has previously been conferred.
Date Robin Englund
Date Andreas Grill
Approved,
Date Mentor: Arild Moldsvor
Date Examiner: Peter Röjder
Abstract
A large amount of today’s telecommunication consists of mobile and short distance wireless applications, where the effect of the channel is unknown and changes over time, and thus needs to be described statistically. Therefore the received signal cannot be accurately predicted and has to be estimated. Since telecom systems are implemented in real‐time, the hardware in the receiver for estimating the sent signal can for example be based on a DSP, where the statistical calculations are performed. A fixed‐point DSP with a limited number of bits and a fixed binary point causes larger quantization errors compared to floating point operations, which offer higher accuracy.
The focus of this thesis has been to build a library of functions for handling fixed‐point data. A class that can handle the most common arithmetic operations and a least squares solver for fixed‐point have been implemented in MATLAB code.
The MATLAB Fixed‐Point Toolbox could have been used to solve this task, but in order to have full control of the algorithms and the fixed‐point handling an independent library was created.
The conclusion of the simulation made in this thesis is that the least squares result depends more on the number of integer bits than on the number of fractional bits.
Sammanfattning
A large part of today’s telecommunication consists of mobile short‐distance wireless applications where the influence of the channel is unknown and changes over time. The signal must therefore be described statistically, which means that it cannot be determined exactly but must be estimated. Since telecom systems operate in real time, the hardware in the receiver consists of e.g. a DSP where the statistical calculations are performed. A fixed‐point DSP has a fixed number of bits and a fixed binary point, which introduces larger quantization noise compared to floating point operations, which have higher accuracy.
The focus of this work has been to create a library of functions for handling fixed‐point numbers. A class that can handle the most common arithmetic operations and solve least squares problems has been implemented in MATLAB code.
MATLAB’s Fixed‐Point Toolbox could have been used to solve this task, but in order to have full control of the algorithms and the fixed‐point handling, an independent library of functions was needed.
The conclusion of the simulation made in this thesis is that the result of the least squares method depends more on the number of integer bits than on the number of fractional bits.
Table of Contents
1 INTRODUCTION
2 THEORY
2.1 Principles of telecommunication
2.2 Fixed‐point
2.3 Least squares solution
2.4 MATLAB
2.5 Fixed‐Point Toolbox
3 MATLAB CODE
3.1 Class properties
3.2 Main function
3.3 Input validation
3.4 Rounding and overflow
3.5 Property update
3.6 Arithmetic operations and solvers
3.7 Solving least squares problems
3.8 Integer controls
3.9 Miscellaneous functions
3.10 Test program
4 SIMULATIONS
5 CONCLUSIONS
6 REFERENCES
Appendix A: Glossary
Appendix B: MATLAB code – Class
Appendix C: MATLAB code – Test program
Appendix D: MATLAB code – Lssolve simulation
Appendix E: MATLAB code – Graph printer
Appendix F: MATLAB code – Lssolve float
List of Figures and Tables
Figure 2.1: Principles of telecommunication.
Figure 2.2: Binary representation of the decimal value 18.25.
Figure 2.3: Illustration of the least squares problem.
Figure 2.4: Creation of a fixed‐point object, fi, with default properties.
Figure 3.1: Creation of a fixed‐point object, fp, with default properties.
Table 3.1: Description of the class properties.
Table 3.2: Default values for the class properties.
Table 3.3: Description of the main function.
Table 3.4: Description of input validation functions.
Table 3.5: Description of rounding and overflow functions.
Table 3.6: Description of property update and min/max functions.
Table 3.7: Description of the arithmetic functions.
Table 3.8: Description of the multiplication and division solver functions.
Table 3.9: Description of functions for solving least squares problems.
Table 3.10: Internal functions used by the lssolve and qr functions.
Table 3.11: Description of the integer controls functions.
Table 3.12: Description of the miscellaneous functions.
Figure 3.2: Test program, error message: Mrdivide matrix / scalar.
Table 4.1: Data statistics from simulation.
Figure 4.1: Simulation for bit width 24 and fractional width 0.
Figure 4.2: Simulation for bit width 24 and fractional width 4.
Figure 4.3: Simulation for bit width 24 and fractional width 8.
Figure 4.4: Simulation for bit width 24 and fractional width 12.
Figure 4.5: Simulation for bit width 24 and fractional width 16.
Figure 4.6: Simulation for bit width 32 and fractional width 16.
Figure 4.7: 3D‐illustration for varying bit widths and fractional widths.
1 Introduction
A large amount of today’s telecommunication consists of mobile and short distance wireless applications, where the effect of the channel* is unknown and changing, and thus needs to be described statistically. Therefore the received signal cannot be accurately predicted. Since telecom systems are implemented in real‐time, the hardware in the receiver for estimating the sent signal is based on a DSP. A fixed‐point DSP with a limited number of bits and a fixed radix point causes larger quantization errors compared to floating point operations, which offer higher accuracy.
The task of this thesis is to create a fixed‐point simulator used to compare the accuracy obtained when selecting different numbers of bits. The simulator should resemble a DSP and is specified to be implemented in MATLAB [1] code, which in this thesis is used as an object‐oriented language, much like C++. It should be able to handle the most common mathematical operations for real and complex numbers, in scalars, vectors and matrices.
The MATLAB Fixed‐Point Toolbox [2] could be used to solve this task, but in order to have full control of the algorithms and the fixed‐point handling, a standalone library is needed which is independent of the Fixed‐Point Toolbox. The Fixed‐Point Toolbox is instead used as a reference for the calculations and as a template for the interface.
* Words in blue are explained in Glossary, see Appendix A.
2 Theory
2.1 Principles of telecommunication
The basic elements of a telecommunication system are a transmitter, a channel and a receiver. In reality the signal has to be modulated to be able to be sent over the channel. In analog systems the two most common modulation techniques are AM and FM. For digital systems the two most common techniques are PSK and QAM. At the receiver the signal is demodulated to recover the originally sent signal, see figure 2.1.
Figure 2.1: Principles of telecommunication.
A problem in the system in figure 2.1 is that the effect of the channel is unknown; therefore the received signal cannot be accurately predicted.
Another problem is introduced when implementing the system in a fixed‐point DSP. The bit width is constant (fixed), which leads to a precision loss due to rounding and overflow. The difference between the actual signal and the quantized signal is called quantization error. This problem also occurs in other embedded systems that include a DSP, such as a TV or an MP3 player.
2.2 Fixed‐point
In fixed‐point the binary point has a fixed position, compared to floating point [3] where the binary point can “float”. The binary point separates the integer part from the fractional part. Figure 2.2 shows how the decimal number 18.25 is represented in binary with a fixed bit width. To be able to represent negative values the first bit must be allocated to define whether the number is positive or negative; in figure 2.2 the msb is set to 0, which makes the value positive.
Figure 2.2: Binary representation of the decimal value 18.25.
When a value has to be rounded to fit the bit width a precision loss emerges. For example the decimal value 18.32 is rounded to 18.5 when using one fractional bit and to 18.25 when using two fractional bits.
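The rounding above can be sketched in a few lines of Python (the helper name `quantize` is illustrative, not from the thesis code; round-half-up is used, matching the ‘nearest’ mode described in chapter 3):

```python
import math

def quantize(value, fracwidth):
    """Round value to the nearest multiple of 2**-fracwidth.
    Ties round toward positive infinity (round-half-up)."""
    scale = 2 ** fracwidth
    return math.floor(value * scale + 0.5) / scale

a = quantize(18.32, 1)  # one fractional bit  -> 18.5
b = quantize(18.32, 2)  # two fractional bits -> 18.25
```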
In a fixed‐point DSP fixed‐point numbers are used to decrease the execution time. The alternative is to use a floating point DSP which has longer execution times. The drawback with a fixed‐point DSP is the precision loss due to the fixed position of the binary point.
2.3 Least squares solution
A system of linear equations can be used to fit a linear mathematical model to some measured data points. A linear system is written as Ax = b, where A is the matrix containing the coefficients, b is the result vector containing the measured values and x is the vector of unknown parameters, which has to be determined. When a system is overdetermined there are more rows than columns in the matrix, i.e. more equations than unknown parameters. Such systems can in general not be solved exactly. One of the most common methods for solving this problem is to determine the least squares solution. The least squares problem can be illustrated as in figure 2.3, where the principle is to determine x so that the shortest distance between the vectors Ax and b is found. This is achieved when the angle between the dashed line and the vector Ax is 90 degrees.
Figure 2.3: Illustration of the least squares problem.
To solve least squares problems there are several different methods. The method that will be used in this thesis is the Householder QR decomposition [5]. Here the matrix A is factorized as A = QR, where Q is an orthogonal matrix and R is an upper triangular matrix. The solution is then given by backward substitution.
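As an illustration of the QR approach, the Python sketch below solves a small overdetermined line fit. For brevity it uses classical Gram–Schmidt orthogonalization instead of the Householder reflections used in the thesis; the factorization A = QR and the backward substitution step are the same in principle. All names are illustrative:

```python
import math

def qr_gram_schmidt(A):
    """QR factorization via classical Gram-Schmidt (the thesis uses the
    numerically more robust Householder reflections). A is an m x n list
    of rows with m >= n and full column rank; Q is returned as a list of
    n orthonormal columns, R as an n x n upper triangular matrix."""
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    Q, R = [], [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        for i, q in enumerate(Q):
            R[i][j] = sum(q[k] * cols[j][k] for k in range(m))
            v = [v[k] - R[i][j] * q[k] for k in range(m)]
        R[j][j] = math.sqrt(sum(t * t for t in v))
        Q.append([t / R[j][j] for t in v])
    return Q, R

def lstsq_qr(A, b):
    """Least squares: minimize ||Ax - b|| by solving R x = Q^T b
    with backward substitution."""
    Q, R = qr_gram_schmidt(A)
    n = len(R)
    qtb = [sum(q[k] * b[k] for k in range(len(b))) for q in Q]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(R[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (qtb[i] - s) / R[i][i]
    return x

# Fit y = x0 + x1*t to the points (0,1), (1,2), (2,2): a 3x2 system
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
b = [1.0, 2.0, 2.0]
x = lstsq_qr(A, b)  # x is approximately [7/6, 1/2]
```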
2.4 MATLAB
MATLAB is a high‐level technical computing language and interactive environment for algorithm development, data visualization, data analysis and numeric computation. It can be used in a wide range of applications such as signal and image processing. Add‐on toolboxes can be used to extend MATLAB [1] for special purposes like the Fixed‐Point Toolbox [2], which will be used as a reference to the program.
2.5 Fixed‐Point Toolbox
The Fixed‐Point Toolbox extends MATLAB to provide fixed‐point data types and arithmetic operations. It can be used to design fixed‐point algorithms that can be executed at compiled C‐code speed. The Fixed‐Point Toolbox can e.g. be used for generating test sequences to verify software and hardware implementations. In figure 2.4 a fixed‐point object, fi, is created with the value π. [2]
Figure 2.4: Creation of a fixed‐point object, fi, with default properties.
3 MATLAB code
The program is implemented in MATLAB code. The Fixed‐Point Toolbox is used as a reference for the calculations and a template for the interface.
The program consists of a class called fp which holds a number of properties and functions, see figure 3.1. The properties are divided into three categories with different attributes. In the first category are the default properties, which are constant and hidden from the user, i.e. they are predefined and cannot be modified. The properties in the second category are private and visible to the user. In the third and last category the properties are private and hidden from the user.
Figure 3.1: Creation of a fixed‐point object, fp, with default properties.
The class functions are likewise divided into three groups with different attributes. Some functions can be called by the user and some are private and can only be called by the class functions.
When a new object is created the input value is given as a float, which means that it can contain fractional bits. This value is then rounded and saturated to fit the given bit width and fractional width. The value is then recalculated and stored in the property data. The recalculation is done by moving the binary point n steps to the right, where n is the fractional width; this makes data an integer value, see equation (3.1). All further operations are performed on the data property; the float property is only used for representation and is only calculated when needed, see equation (3.2).
data = input ∙ 2^fracwidth (3.1)
float = data ∙ 2^−fracwidth (3.2)
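Equations (3.1) and (3.2) amount to shifting the binary point; a minimal Python sketch (the helper names are illustrative, and Python’s built-in round is used here for simplicity rather than the class’s ‘nearest’ mode):

```python
def to_data(value, fracwidth):
    # equation (3.1): shift the binary point fracwidth steps to the right
    return round(value * 2 ** fracwidth)

def to_float(data, fracwidth):
    # equation (3.2): shift the binary point back to recover the float value
    return data * 2 ** -fracwidth

d = to_data(18.25, 4)   # 18.25 * 16 = 292, stored as an integer
f = to_float(d, 4)      # 292 / 16 = 18.25, recovered for representation
```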
3.1 Class properties
The class consists of several properties, see table 3.1. Properties that are not entered when creating a new object are set to default values, see table 3.2.
Table 3.1: Description of the class properties.
Property Description
data Stores the input value as an integer.
float Stores a float value calculated from the data value combined with the fracwidth, see (3.2).
signed Determines if the value can be a negative value.
1 ‐ Signed
0 ‐ Unsigned
bitwidth Determines by how many bits the value should be represented.
fracwidth Determines by how many fractional bits the value should be represented.
roundmode Determines which type of rounding should be used when handling the object.
Nearest ‐ Rounds data to nearest integer, with ties rounding toward positive infinity.
Round ‐ Rounds data to nearest integer, with ties rounding away from zero, i.e. toward positive infinity for positive values and toward negative infinity for negative values.
Up ‐ Rounds data up to nearest integer.
Down ‐ Rounds data down to nearest integer.
Truncate ‐ Rounds data towards zero, i.e. truncation.
overflowmode Determines which type of overflow should be used when handling the object.
Saturate ‐ Sets data to the minimum or maximum value if outside the data range.
Wrap ‐ Wraps data if outside the data range, i.e. if the value becomes greater than the maximum value it restarts from the minimum value.
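The two overflow modes can be sketched as follows (a Python illustration of the behaviour described above, assuming two’s-complement signed values; function names mirror the modes):

```python
def saturate(data, bitwidth, signed=True):
    """Clamp data to the representable integer range."""
    lo = -(1 << (bitwidth - 1)) if signed else 0
    hi = (1 << (bitwidth - 1)) - 1 if signed else (1 << bitwidth) - 1
    return max(lo, min(hi, data))

def wrap(data, bitwidth, signed=True):
    """Wrap data modulo the representable range (two's complement):
    exceeding the maximum restarts from the minimum value."""
    m = 1 << bitwidth
    data %= m
    if signed and data >= m >> 1:
        data -= m
    return data

saturate(130, 8)  # -> 127 (clamped to the signed 8-bit maximum)
wrap(130, 8)      # -> -126 (wrapped past the maximum)
```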
Table 3.2: Default values for the class properties.
Property Default value
data 0
signed 1
bitwidth 8
fracwidth 2
roundmode ‘nearest’
overflowmode ‘saturate’
3.2 Main function
In table 3.3 the main function is given.
Table 3.3: Description of the main function.
Function Description
fp The class constructor, which runs when the object is created. The constructor has the same name as the class and the only output argument is the new object. Appendix B, row 89.
3.3 Input validation
In table 3.4 private functions for reading and validating the input are given.
Table 3.4: Description of input validation functions.
Function Description
read_input Reads the input when a new object is created, properties that are not entered are set to default values. Appendix B, row 686.
check_input Checks if input is a valid property, if not, the property is set to default value. Appendix B, row 751.
3.4 Rounding and overflow
In table 3.5 private functions for rounding and overflow of the value are given.
Table 3.5: Description of rounding and overflow functions.
Function Description
round_over Rounds and saturates the value by calling the functions check_roundmode and check_overflowmode. Returns the value divided into real and imaginary parts. Appendix B, row 797.
round_over_comp Rounds and saturates the value by calling the functions check_roundmode and check_overflowmode. Returns the complex value. Appendix B, row 804.
check_roundmode Checks the rounding mode in the properties of the object and calls the corresponding rounding function. Appendix B, row 812.
check_overflowmode Checks the overflow mode in the properties of the object and calls the corresponding overflow function. Appendix B, row 831.
round_nearest Rounds data to nearest integer, with ties rounding toward positive infinity. Appendix B, row 927.
round_round Rounds data to nearest integer, with ties rounding away from zero, i.e. toward positive or negative infinity. Appendix B, row 959.
round_up Rounds data up to nearest integer. Appendix B, row 967.
round_down Rounds data down to nearest integer. Appendix B, row 974.
round_truncate Rounds data towards zero, i.e. truncation. Appendix B, row 981.
saturate Sets data to the minimum or maximum value if outside the data range. Appendix B, row 990.
wrap Wraps data if outside the data range. Appendix B, row 1022.
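The difference between round_nearest and round_round only shows for negative ties; a small Python illustration (the function names mirror the table, the one-line implementations are a sketch):

```python
import math

def round_nearest(x):
    # ties toward positive infinity: -2.5 -> -2, 2.5 -> 3
    return math.floor(x + 0.5)

def round_round(x):
    # ties away from zero: -2.5 -> -3, 2.5 -> 3
    return math.copysign(math.floor(abs(x) + 0.5), x)

round_nearest(-2.5)  # -> -2
round_round(-2.5)    # -> -3.0
```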
3.5 Property update
In table 3.6 private functions for property update and min/max value calculation are given.
Table 3.6: Description of property update and min/max functions.
Function Description
update Updates the object properties. Appendix B, row 773.
set_minmax Sets the properties for minimum and maximum value, calculated depending on the bit width. Appendix B, row 784.
3.6 Arithmetic operations and solvers
The arithmetic functions are called upon any combination of an object and a float. Both the object and the float can independently be a scalar, vector or a matrix. All these arithmetic operations are only possible if the operands have the right sizes. For example, when using multiplication the number of columns in the first matrix must be the same as the number of rows in the second matrix.
In a DSP there is only a limited number of bits to store the data; because of this the DSP is forced to round and saturate. This is done both to the incoming values and after every step in the calculation. The arithmetic functions are adapted to achieve this behaviour and are shown in table 3.7.
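The effect of rounding after every step can be illustrated in Python (a sketch with hypothetical helper names, using ‘nearest’ rounding and omitting saturation for clarity):

```python
import math

def q(value, fracwidth):
    """Quantize to the nearest multiple of 2**-fracwidth (round-half-up)."""
    s = 2 ** fracwidth
    return math.floor(value * s + 0.5) / s

def dot_fixed(a, b, fracwidth):
    """Dot product where every product and every running sum is
    re-quantized, mimicking a DSP that rounds after each step."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = q(acc + q(x * y, fracwidth), fracwidth)
    return acc

# With 2 fractional bits (step 0.25) the result drifts from the float value:
dot_fixed([1.1, 2.2], [3.3, 1.0], 2)  # -> 6.0 (the float result is 5.83)
```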
Table 3.7: Description of the arithmetic functions.
Function Description
plus Overloads the default + operator. Appendix B, row 103.
minus Overloads the default – operator. Appendix B, row 119.
mtimes Overloads the default * operator. Appendix B, row 135.
times Overloads the default .* operator. Appendix B, row 151.
mrdivide Overloads the default / operator. Division with 0 results in 1. Appendix B, row 167.
rdivide Overloads the default ./ operator. Division with 0 results in 1. Appendix B, row 179.
mpower Overloads the default ^ operator. Appendix B, row 193.
ctranspose Overloads the default ' operator. Appendix B, row 204.
transpose Overloads the default .' operator. Appendix B, row 211.
The functions in table 3.8 are private and execute the calculations for the multiplication and division functions in table 3.7.
Table 3.8: Description of the multiplication and division solver functions.
Function Description
mtimes_solve Performs the matrix multiplication for the function mtimes. Appendix B, row 454.
times_solve Performs the element‐by‐element multiplication for the functions times and mtimes. Appendix B, row 498.
divide_solve Performs the array right division for the functions mrdivide and rdivide. Appendix B, row 561.
3.7 Solving least squares problems
In table 3.9 functions for solving least squares problems and orthogonal‐triangular decomposition are given.
Table 3.9: Description of functions for solving least squares problems.
Function Description
lssolve Solves the least squares problem for an algebraic equation with the Householder transformation. Appendix B, row 220.
qr Householder orthogonal‐triangular decomposition, which results in Q and R, where Q is an orthogonal matrix and R is an upper triangular matrix with positive diagonal elements. Appendix B, row 260.
usolve Solves the equation for x, in U∙x = b, by backward substitution of the upper triangular matrix U [5.1]. Appendix B, row 651.
The functions in table 3.10 are private and used internally by the functions in table 3.9.
Table 3.10: Internal functions used by the lssolve and qr functions.
Function Description
qr_A Used internally in lssolve and qr to find the factorization consisting of Q and R in a compact form [5.2]. Appendix B, row 670.
QT_b Used internally in lssolve to perform the operation (QT∙b) [5.3]. Appendix B, row 614.
qr_house Used internally in the function qr_A [5.4]. Appendix B, row 627.
qr_row_house Used internally in the functions qr_A and QT_b [5.3]. Appendix B, row 642.
qr_Q Used internally in qr to find Q [5.6]. Appendix B, row 670.
3.8 Integer controls
In table 3.11 private functions for integer controls are given.
Table 3.11: Description of the integer controls functions.
Function Description
is_integer Returns 1 if the incoming value is an integer, 0 otherwise. Appendix B, row 1054.
is_positive_integer Returns 1 if the incoming value is a positive integer, 0 otherwise. Appendix B, row 1064.
3.9 Miscellaneous functions
In table 3.12 some miscellaneous functions are given.
Table 3.12: Description of the miscellaneous functions.
Function Description
shift Shifts the bits to the left if a positive value is entered and to the right if the value is negative. Maximum number of shifts is 47. Appendix B, row 306.
set_fracwidth Sets a new fractional width, which will change the float value. Appendix B, row 387.
create_object Creates a new object which inherits the properties from the old object, but with the new entered value. Appendix B, row 847.
to_object Converts a float input to an object and returns the object. Appendix B, row 855.
sign Returns 1 if the entered value is greater than zero, 0 if the value is equal to zero and ‐1 if the value is less than zero. Appendix B, row 875.
subsref Overloads the default () indexing so that data in objects can be indexed [4]. Appendix B, row 412.
ind1 Returns a new object with the indexed cells from the input object. Can handle obj(i). Appendix B, row 881.
ind2 Returns a new object with the indexed cells from the input object. Can handle obj(m,n). Appendix B, row 890.
ind11 Returns data value with the indexed cells from the input object. Can handle obj(i). Appendix B, row 899.
ind22 Returns data value with the indexed cells from the input object. Can handle obj(m,n). Appendix B, row 907.
data_return Returns the data value from the object. Appendix B, row 915.
3.10 Test program
The purpose of the test program, Appendix C, is to validate that the code implementation is correct. The test is constructed to compare fp with the Fixed‐Point Toolbox. This is done by executing the same operations in both programs and comparing the results. The test is divided into different parts that execute different operations. The test program uses global properties which makes it possible to test every function in the program with different properties.
If the result of any part of the test differs, the test program shows both results and an error message. An example of this is shown in figure 3.2.
Figure 3.2: Test program, error message: Mrdivide matrix / scalar.
4 Simulations
In the main simulation the fixed‐point least squares solver in fp is compared to the standalone floating point version of the least squares solver in appendix F. The implemented code for the simulator can be seen in appendix D. The simulation data is then represented visually by the graph printer, which can be seen in appendix E.
The simulation is performed with bit widths of 8, 16, 24 and 32 together with fractional widths of 0, 4, 8, 12 and 16, with one bit allocated for the sign. The inputs are 250 random uniformly distributed 16x10 matrices. The matrices consist of real values within a range of −2^7 to 2^7 − 2^−16, which is calculated from a bit width of 24 and a fractional width of 16.
The simulation calculates the differences between the least squares solutions of the fixed‐point and the floating point versions by using the 2‐norm. Table 4.1 shows the statistics of these norms for different bit widths and fractional widths.
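The comparison measure is the 2-norm of the difference between the two solution vectors; a minimal Python sketch with made-up example vectors (the values below are illustrative, not from the simulation):

```python
import math

def norm2(v):
    # 2-norm: the square root of the sum of squared elements
    return math.sqrt(sum(t * t for t in v))

# hypothetical fixed-point and floating point least squares solutions
x_fixed = [1.19, 0.50]
x_float = [1.1667, 0.5]
err = norm2([a - b for a, b in zip(x_fixed, x_float)])
```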
Table 4.1: Data statistics from simulation.
Bit width Fractional width Mean value Min value Max value Standard deviation
8 0 113.66105 2.75067 251.89520 63.71935
8 4 12.70712 1.96543 19.74751 2.96162
8 8 1.94026 0.93781 4.50070 0.48574
8 12 1.33268 0.50912 4.45649 0.54967
8 16 1.33819 0.52110 4.42600 0.54876
16 0 45.81804 2.24227 722.66654 82.82171
16 4 0.23123 0.06876 1.16822 0.15010
16 8 46.10575 1.58449 221.54608 59.18138
16 12 12.89724 1.73788 20.51517 2.99155
16 16 1.94514 0.68162 4.49931 0.50194
24 0 66.33105 2.69795 2020.59996 189.51837
24 4 0.23123 0.06876 1.16822 0.15010
24 8 0.01935 0.00468 0.10678 0.01283
24 12 0.00137 0.00028 0.01303 0.00110
24 16 47.40855 1.39667 221.54754 59.92337
32 0 66.33105 2.69795 2020.59996 189.51837
32 4 0.23123 0.06876 1.16822 0.15010
32 8 0.01935 0.00468 0.10678 0.01283
32 12 0.00137 0.00028 0.01303 0.00110
32 16 0.00008 0.00001 0.00043 0.00005
For a bit width of 8 the fractional width is greater than the bit width in two of the cases. This, together with the low bit width, gives results that are difficult to compare with the other bit widths. Therefore the cases for a bit width of 8 are excluded from the analysis.
By looking at the mean values for the remaining bit widths it is seen that the best results are achieved when there are at least 11 integer bits and at least 4 fractional bits. This is because in a multiplication the product can become greater than the maximum value; if the product exceeds this value it will be saturated. To avoid saturation the bit width needs to be doubled, which leads to 14 integer bits. The only case with at least 14 integer bits in this simulation is the last one, with a bit width of 32 and a fractional width of 16. As can be seen in table 4.1 this gives the best result.
The maximum rounding error that can occur in each arithmetic operation is calculated as in equation (4.1).
max rounding error = ±2^(−fractional width) (4.1)
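The bound in equation (4.1) can be checked empirically in Python (a sketch using a hypothetical round-half-up quantizer; with nearest rounding the observed errors in fact stay within half of this bound):

```python
import math
import random

def quantize(value, fracwidth):
    # round-half-up quantization to fracwidth fractional bits
    scale = 2 ** fracwidth
    return math.floor(value * scale + 0.5) / scale

# the observed error never exceeds the +-2**-fracwidth bound of (4.1)
fracwidth = 4
worst = max(abs(quantize(x, fracwidth) - x)
            for x in (random.uniform(-8.0, 8.0) for _ in range(10000)))
```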
Since the matrices are created with a bit width of 24, all cases for that bit width are analyzed and used for comparison with the other bit widths.
Figure 4.1 shows a graph of the simulation with a bit width of 24 and a fractional width of 0. As can be seen the norm values are very high; this is due to the lack of fractional bits, which causes large rounding errors.
Figure 4.1: Simulation for bit width 24 and fractional width 0.
By increasing the fractional width to 4 the norm is drastically decreased. The fractional width of 4 makes it possible to represent decimals, which increases the precision since the maximum rounding error becomes smaller.
Observe that the range of the y‐axis is changed; this also applies to the other graphs in this simulation.
Figure 4.2: Simulation for bit width 24 and fractional width 4.
To decrease the norm even further the fractional width is increased to 8, which makes it possible to represent more decimals and increases the precision further.
Figure 4.3: Simulation for bit width 24 and fractional width 8.
By increasing the fractional width to 12 the norms are decreased even further, since even more decimals can be represented.
Figure 4.4: Simulation for bit width 24 and fractional width 12.
In figure 4.5 the fractional width is increased to 16. This makes it possible to represent even more decimals, but it decreases the number of integer bits to 7. As discussed before, the multiplication needs double the bit width to avoid saturation. Since there are only 7 integer bits, saturation is introduced. This is easily seen at the level of approximately 128 in figure 4.5. These levels also appear when a bit width of 16 and a fractional width of 8 are used, again due to the number of integer bits.
Figure 4.5: Simulation for bit width 24 and fractional width 16.
By analysing figures 4.1–4.5 it is seen that figure 4.4 shows the best result. To increase the precision even further the fractional width needs to be increased, but if the bit width is not also increased the result will be as in figure 4.5. To avoid saturation the bit width therefore needs to be increased as well, as in figure 4.6. This case gives the best result of the simulation.
Figure 4.6: Simulation for bit width 32 and fractional width 16.
To get a better overview of the relation between the bit width and the fractional width when solving least squares problems, a 3‐D illustration is shown in figure 4.7. The input to the simulation is 120 random 6x4 matrices, consisting of real values with a maximum value calculated from a bit width of 24 and a fractional width of 16. In figure 4.7 it is easily seen that if the fractional width is increased the bit width also needs to be increased to avoid the high peaks. The peaks appear when the difference between the bit width and the fractional width is 7, because there are then only 6 integer bits plus one sign bit, which causes saturation. For example, one of the peaks appears when the bit width is 16 and the fractional width is 9.
Figure 4.7: 3D‐illustration for varying bit widths and fractional widths.
5 Conclusions
The focus of this thesis has been to build a library of functions for handling fixed‐point data. A class that can handle the most common arithmetic operations and a least squares solver for fixed‐point have been implemented. In addition to the original task, further functions have been added in consultation with the mentor.
For general calculations the class fp follows the Fixed‐Point Toolbox, but there are some important exceptions. In the Fixed‐Point Toolbox a division by zero results in the maximum data value; the class fp instead returns one, since DSPs often handle this problem that way [7]. Another exception is that the arithmetic functions of the class fp round incoming float values, since a DSP must receive rounded and saturated values to be able to perform arithmetic operations; the Fixed‐Point Toolbox does not perform this rounding. The last exception is that the Fixed‐Point Toolbox rounds the quotient of a division to an integer, which in most cases results in an unnecessary precision loss. The class fp keeps the original precision, i.e. no additional rounding of the quotient is performed.
The conclusion of the simulation made in this thesis is that the fractional width needs to be greater than zero due to rounding. The results depend more on the number of integer bits, due to the need for double bit width in a multiplication. If the fractional width is increased to minimize rounding errors, the bit width also needs to be increased to avoid saturation.
The main simulation in this thesis took about ten hours to complete; because of this, only one simulation has been performed and analyzed. This makes the conclusion of the simulation less general than desired. More simulations would have to be analyzed to draw a more general conclusion; this could be done in future work.
One idea for future work is to make it possible to increase the precision within the arithmetic functions. Another is to implement an iterative method for solving least squares problems, for example the Gauss‐Seidel method, which could then be compared to the Householder QR decomposition method.
6 References
[1] MathWorks, http://www.mathworks.com, May 2009.
[2] Fixed‐Point Toolbox, http://www.mathworks.com/products/fixed, May 2009.
[3] Floating point arithmetic, http://www.validlab.com/goldberg/paper.pdf, May 2009.
[4] Indexed Reference and Assignment, http://www.mathworks.com/access/helpdesk/help/techdoc/matlab_oop/br09eqz.html#br1b3ie, May 2009.
[5] Gene H. Golub, Charles F. Van Loan, Matrix Computations, Second edition, The Johns Hopkins University Press (1989), ISBN: 0‐8018‐3772‐3.
[5.1] Algorithm 3.1.2 (Back Substitution: Row Version), page 88.
[5.2] Algorithm 5.2.1 (Householder QR), page 212.
[5.3] Algorithm 5.3.2 (Householder LS Solution), page 226.
[5.4] Algorithm 5.1.1 (Householder Vector), page 196.
[5.5] Algorithm 5.1.2 (Householder Pre‐Multiplication), page 197.
[5.6] Example 5.15, Backward Accumulation, page 199.
[6] Ingemar Johansson, Senior Designer, Tieto, Discussion, April 2009.
Appendix A: Glossary
Word Description
2‐norm The 2‐norm of a vector x is calculated as the square root of x∙xᵀ.
AM Amplitude modulation.
backward substitution Backward substitution is used for solving upper triangular systems Ux = b.
binary point The binary point separates the integer part of the number from its fractional part in the binary numeral system (radix point base 2).
bit width Determines by how many bits the value should be represented.
channel The channel is the medium that the signal has to pass to get from the transmitter to the receiver.
class A class is a construct that uses a template to create an object. The template includes the properties that the object gets.
demodulate The demodulator extracts the original signal from the received signal.
DSP Digital Signal Processor. A digital signal processor is a microprocessor for digital signal processing.
fixed‐point In a fixed‐point value the radix point is fixed, i.e. the number of fraction bits is constant.
float See floating point.
floating point In a floating point value the binary point can float, i.e. the number of fraction bits can vary.
fractional bit The fractional bits are located to the right of the binary point and represent the fraction part of the number.
fractional width The number of fractional bits.
FM Frequency modulation.
Householder QR decomposition The Householder QR decomposition is a method for computing the QR factorization, which can be used for solving least squares problems.
identity matrix The identity matrix is a square matrix with ones in the diagonal and zeros elsewhere.
least squares The least squares method is used to approximate overdetermined systems.
linear equation See linear system.
linear system A system is linear if each equation has the form ax1 + bx2 + cx3 = d; a non‐linear system contains e.g. ax^3 + bx^2 + cx = d.
modulate The modulator combines the information signal with a carrier wave to be able to transmit.
msb The most significant bit is the bit with the highest value, i.e. the first bit from the left.
object The object consists of the data and the different properties that are defined in the class.
orthogonal matrix A matrix is orthogonal if its transpose is equal to its inverse. QT∙Q = Q∙QT = I, where Q is the orthogonal matrix and I is the identity matrix.
PSK Phase‐shift keying.
QAM Quadrature amplitude modulation.
quantization error A quantization error is introduced by conversion from an analog to a digital signal. The error occurs due to rounding and truncation.
quantization Quantization is used when converting an analog signal to a digital signal.
real‐time In a real‐time system every step has a defined limited execution time.
upper triangular matrix An upper triangular matrix is a square matrix where the entries below the main diagonal are zero.
Appendix B: MATLAB code – Class
Appendix C: MATLAB code – Test program
Appendix D: MATLAB code – Lssolve simulation
Appendix E: MATLAB code – Graph printer
Appendix F: MATLAB code – Lssolve float