# Working with Floating Point numbers in Java

Manoj Debnath
Published: January 30, 2023

There are two types of numbers we typically use in programming: integers and floating-point numbers. Integers are whole numbers, with no fractional part, and can be positive, negative, or zero. Calculation and storage operations on them are relatively simple, which makes them quick and efficient to work with, so they are the recommended choice wherever a calculation allows it. In Java, depending on the data type, integers can range from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (the limits of the long type).

Floating-point numbers, meanwhile, can be thought of as decimal point numbers, such as 2.0 or 14.321. They have special properties that can cause problems in terms of storage and operations. While developers can often ignore the complexities of floating-point numbers, in a critical situation a good grasp of their inner workings is indispensable. In this programming tutorial we will focus specifically on floating-point numbers and how to use them effectively in Java software development.

## How Numbers Work in Java Memory

Java uses a binary number system to represent numbers, and different numeric types use a different number of bits. To keep things simple, let’s assume computer memory is a 6×6 array where each row holds one number of a fixed size. Each row of our 6×6 array can then store a sign plus five decimal digits. When we define a variable or a constant, the location assigned to it consists of a sign (+ or -) and five digits. The storage of non-fractional numbers, such as an integer variable or constant, is pretty straightforward: the first position holds the sign, followed by five digits (in our case). The smallest and largest possible numbers therefore span the range -99,999 to +99,999.

```
[+][9][9][9][9][9] – Largest +ve number
[+][0][0][0][0][0] – Zero
[-][9][9][9][9][9] – Largest -ve number
```

In this regard, one important thing to call out is precision, which stands for the maximum number of digits each memory location can hold. In our case, it is five digits, so we cannot store numbers greater than 99,999. To get around this, there is a trick developers can take advantage of: we can use the leftmost digit to represent an exponent, as shown here:

```
[+][9][9][9][9][9] = 9999 x 10^9
```

Using this method of exponentiation, programmers can represent a much greater range than -99,999 to +99,999, namely -9,999,000,000,000 through +9,999,000,000,000. However, the precision in this case is reduced to 4 digits. If the coding scheme is limited to 4 significant digits (as in our hypothetical case), then the four leftmost digits are represented correctly and the rightmost digits, the least significant ones, are lost (assumed to be 0).
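The same trade-off between range and precision can be observed with Java's real types. The following is a minimal sketch, not part of the original tutorial: the class name `PrecisionLossDemo` and the helper `roundTripThroughFloat` are hypothetical names chosen for illustration. It stores an `int` in a `float`, which keeps only 24 significant binary digits, and converts it back.

```java
// Hypothetical demo class; the names are assumptions made for illustration.
class PrecisionLossDemo {

    // Stores an int in a float and converts it back, exposing any lost digits.
    static int roundTripThroughFloat(int value) {
        float stored = value;   // widening conversion rounds to 24 significant bits
        return (int) stored;
    }

    public static void main(String[] args) {
        int small = 99_999;        // needs 17 bits, fits in a float exactly
        int large = 123_456_789;   // needs 27 bits, more than a float can keep

        System.out.println(small + " -> " + roundTripThroughFloat(small));
        System.out.println(large + " -> " + roundTripThroughFloat(large));
    }
}
```

The small value survives unchanged, while the large one comes back as 123456792: the leading digits are right, but the least significant ones have been lost, just as in the 4-digit scheme above.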

## How to Represent Floating-point Numbers in Java

Storing a floating-point variable or a constant is not as simple as you might first suspect, because the number consists of a whole part and a fractional part. Each memory location (also called a word) stores only a finite number of digits. One method of representing such numbers is to assume a fixed position for the decimal point and store all numbers (shifting them appropriately if necessary) relative to that assumed point. In such a convention, the maximum and minimum (in magnitude) numbers that can be stored are 9999.99 and 0000.01, respectively. Because the aim is to maximize the range of values that can be stored, a floating-point scheme lets the decimal point move instead. In a nutshell, a floating-point number has two parts: exponent and mantissa. Sometimes numbers have to be normalized (by discarding or rounding excess digits, for example) to keep a value close to the exact number within the restricted range (the allowed word length). Consider the following case:

```
9546 x 10^-2 = 95.46
35 x 10^-4 = .0035
```

According to our hypothetical case, we can now represent any number between -9999 x 10^9 and +9999 x 10^9, accurate to 4 significant digits. Any number outside this range is discarded or produces unpredictable results if used in mathematical calculations. For example, in this 4-significant-digit scheme, numbers like 0.2056, -6.789, and 1000000 are represented exactly, but a number such as 123.0897 is not, because it has 7 significant digits counting both sides of the decimal point.

Therefore, it would normally be represented as 123.0, discarding other numbers after the decimal place.

In Java, the Java Virtual Machine (JVM) rounds numbers rather than simply truncating excess digits. This is not true for all programming languages, because there are systems that do not do any rounding. So, in our case, if we follow the JVM scheme, the number actually stored is 123.1 and not 123.0. In any case, this is a significant loss of precision. Rounding is a conscious attempt to make the stored value less incorrect than plain truncation would leave it, but the result is still not exact. This matters in critical situations, such as scientific observation or monetary transactions, where precision is essential. The point, however, is that in computer programming this kind of error is unavoidable; a wider type can only offer more digits of precision to compensate. At some threshold, digits must be discarded. This brings an air of unreliability to floating-point numbers and, as such, they are best avoided unless they are really needed.
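The difference between truncation and rounding in the 4-digit scheme can be reproduced with `java.math.BigDecimal`. This is only a decimal analogy, sketched under the assumption that a 4-digit `MathContext` is a fair stand-in for the hypothetical memory word; the JVM itself rounds `float` and `double` values in binary. The class name `FourDigitSchemeDemo` and the helper `toFourDigits` are made up for this example.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

// Hypothetical demo class modelling the tutorial's 4-significant-digit scheme.
class FourDigitSchemeDemo {

    // Reduces a decimal string to 4 significant digits using the given rounding mode.
    static BigDecimal toFourDigits(String decimal, RoundingMode mode) {
        return new BigDecimal(decimal).round(new MathContext(4, mode));
    }

    public static void main(String[] args) {
        // Truncation simply drops the excess digits.
        System.out.println("Truncated: " + toFourDigits("123.0897", RoundingMode.DOWN));
        // Rounding picks the nearest 4-digit value instead.
        System.out.println("Rounded:   " + toFourDigits("123.0897", RoundingMode.HALF_EVEN));
    }
}
```

Truncation yields 123.0 and rounding yields 123.1. Neither equals 123.0897, but the rounded value is closer to it.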

Note: Avoid relying on floating-point numbers for exact comparisons in conditional statements or as loop counters. Rounding error is unavoidable in these numbers, so a test that expects an exact value may never succeed. Where precision matters, Java has better options available, such as the integer types or java.math.BigDecimal.
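The note above can be illustrated with a short sketch. The class `FloatComparisonDemo` and its helper methods are hypothetical names, and the tolerance passed to `nearlyEqual` is an arbitrary value chosen for this example, not a general recommendation.

```java
import java.math.BigDecimal;

// Hypothetical demo class; names and tolerance are assumptions for illustration.
class FloatComparisonDemo {

    // Adds 0.1 ten times using double arithmetic; the loop counter is an int.
    static double sumTenths() {
        double sum = 0.0;
        for (int i = 0; i < 10; i++) {
            sum += 0.1;
        }
        return sum;
    }

    // Compares two doubles within a caller-supplied tolerance.
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) <= tolerance;
    }

    // The same sum with BigDecimal, which represents 0.1 exactly in decimal.
    static BigDecimal sumTenthsExactly() {
        BigDecimal sum = BigDecimal.ZERO;
        BigDecimal tenth = new BigDecimal("0.1");
        for (int i = 0; i < 10; i++) {
            sum = sum.add(tenth);
        }
        return sum;
    }

    public static void main(String[] args) {
        double sum = sumTenths();
        System.out.println("sum == 1.0       : " + (sum == 1.0));
        System.out.println("nearly equal     : " + nearlyEqual(sum, 1.0, 1e-9));
        System.out.println("BigDecimal equal : "
                + (sumTenthsExactly().compareTo(BigDecimal.ONE) == 0));
    }
}
```

The exact comparison fails because the accumulated `double` is slightly less than 1.0, while the tolerance-based check and the `BigDecimal` version both succeed. Note that `compareTo` is used rather than `equals`, since `BigDecimal.equals` also compares scale.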

## Java Arithmetic Operations and Floating-point Numbers

In software development, two types of arithmetic operations are performed: integer arithmetic and real or floating-point arithmetic.

The results of integer arithmetic are exact, but floating-point arithmetic is not. In this section we will focus on floating-point arithmetic specifically. Since we now know that floating-point numbers are represented in two parts (exponent and mantissa), addition, subtraction, multiplication and division are done in the following manner. Assume that numbers are all normalized floating point numbers.

## Floating-point Normalization in Java

During normalization, the mantissa is shifted to the left until its most significant (leftmost) digit is non-zero. Normalization is performed to preserve the number of useful digits. For example, the number .005678 has two leading zeros that, if stored, would occupy space unnecessarily. Because of this, it is normalized as .5678 x 10^-2.
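Decimal normalization as described above can be sketched with `BigDecimal`, whose `precision()` and `scale()` methods give enough information to find the exponent. The class `DecimalNormalizer` and its two helpers are hypothetical names for this example, and the sketch assumes a non-zero input.

```java
import java.math.BigDecimal;

// Hypothetical helper that splits a non-zero decimal into mantissa and exponent
// so that value = mantissa x 10^exponent with 0.1 <= |mantissa| < 1.
class DecimalNormalizer {

    // Number of significant digits minus digits after the point gives the exponent.
    static int exponentOf(BigDecimal value) {
        BigDecimal stripped = value.stripTrailingZeros();
        return stripped.precision() - stripped.scale();
    }

    // Moves the decimal point so the first significant digit follows it directly.
    static BigDecimal mantissaOf(BigDecimal value) {
        return value.stripTrailingZeros().movePointLeft(exponentOf(value));
    }

    public static void main(String[] args) {
        for (String text : new String[] {"0.005678", "95.46", "1000000"}) {
            BigDecimal value = new BigDecimal(text);
            System.out.println(text + " = " + mantissaOf(value).toPlainString()
                    + " x 10^" + exponentOf(value));
        }
    }
}
```

For .005678 this produces a mantissa of 0.5678 and an exponent of -2, matching the normalized form given above.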

When two numbers in normalized floating-point notation are added, their exponents must first be made equal by shifting the mantissas appropriately. The mantissas are then added and the result is normalized again.

### Floating-point Subtraction in Java

The principles of subtraction are the same and are nothing more than adding a negative number.
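The exponent-alignment step for addition, and its reuse for subtraction, can be sketched with a toy 4-digit decimal type. `DecimalFloat4` is a hypothetical class written for this tutorial's decimal scheme; it is not how the JVM stores `float` or `double`, and for simplicity it truncates digits that are shifted out instead of rounding them.

```java
// Hypothetical 4-digit decimal floating-point type, used only for illustration.
// The value represented is .dddd x 10^exponent, where dddd is the mantissa.
class DecimalFloat4 {
    final int mantissa;   // signed; magnitude is 1000..9999 when normalized, or 0
    final int exponent;

    DecimalFloat4(int mantissa, int exponent) {
        this.mantissa = mantissa;
        this.exponent = exponent;
    }

    // Shifts the mantissa until it has exactly four digits, adjusting the exponent.
    static DecimalFloat4 normalize(long mantissa, int exponent) {
        if (mantissa == 0) {
            return new DecimalFloat4(0, 0);
        }
        while (Math.abs(mantissa) >= 10_000) { mantissa /= 10; exponent++; }
        while (Math.abs(mantissa) < 1_000) { mantissa *= 10; exponent--; }
        return new DecimalFloat4((int) mantissa, exponent);
    }

    DecimalFloat4 add(DecimalFloat4 other) {
        // Align exponents: shift the mantissa of the smaller-exponent operand right.
        DecimalFloat4 big = this.exponent >= other.exponent ? this : other;
        DecimalFloat4 small = (big == this) ? other : this;
        long shifted = small.mantissa;
        for (int e = small.exponent; e < big.exponent && shifted != 0; e++) {
            shifted /= 10;   // digits shifted out on the right are lost (truncated)
        }
        return normalize(big.mantissa + shifted, big.exponent);
    }

    // Subtraction is addition of the negated operand.
    DecimalFloat4 subtract(DecimalFloat4 other) {
        return add(new DecimalFloat4(-other.mantissa, other.exponent));
    }

    @Override
    public String toString() {
        return (mantissa < 0 ? "-" : "")
                + String.format(".%04d x 10^%d", Math.abs(mantissa), exponent);
    }

    public static void main(String[] args) {
        DecimalFloat4 a = new DecimalFloat4(4546, 5);   // .4546 x 10^5
        DecimalFloat4 b = new DecimalFloat4(5433, 7);   // .5433 x 10^7
        System.out.println(a.add(b));        // .0045 + .5433 -> .5478 x 10^7
        System.out.println(b.subtract(a));   // .5433 - .0045 -> .5388 x 10^7
    }
}
```

Before the mantissas are combined, .4546 x 10^5 is rewritten as .0045 x 10^7, so the digits 46 are dropped. This is exactly the kind of precision loss the earlier sections describe.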

### Floating-point Multiplication in Java

Multiplication of two normalized floating-point numbers is performed by multiplying the mantissas and adding the exponents. The product is then normalized.

### Java Floating-point Division

In the case of division, the mantissa of the numerator is divided by that of the denominator, and the exponent of the denominator is subtracted from the exponent of the numerator. The quotient obtained is finally normalized.

## How Java Handles Floating-point Numbers

As stated, computers use binary numbers rather than decimals. The idea, however, is the same. Different systems provide different levels of precision. Most systems use a two-digit exponent for the smaller floating-point type and a four-digit exponent for larger types. The mantissa may carry 6, 15, or 19 significant digits, and sometimes the exact magnitude depends on the compiler build. The range and precision used by Java, according to the language specification, is given by the following formula:

```
S x M x 2^(e - N + 1)
```

Here, S is either +1 or -1, depending on the sign of the number. For the float type, N is 24, M is a positive integer less than 2^24, and e ranges from -126 to 127, inclusive. For the double type, N is 53, M is a positive integer less than 2^53, and e ranges from -1022 to 1023, inclusive. We can also print the limits of each type: Java numeric classes such as Integer, Float, and Double provide the constants MAX_VALUE and MIN_VALUE. Note that for Float and Double, MIN_VALUE is the smallest positive non-zero value, not the most negative one. Here is a simple code example that prints these limits:

```java
package org.mano.example;

public class Main {

    public static void main(String[] args) {
        // Integer.MIN_VALUE is the most negative int value
        System.out.println("Minimum  integer value: " + Integer.MIN_VALUE);
        System.out.println("Maximum  integer value: " + Integer.MAX_VALUE);
        // For Float and Double, MIN_VALUE is the smallest positive non-zero value
        System.out.println("Minimum  float value: " + Float.MIN_VALUE);
        System.out.println("Maximum  float value: " + Float.MAX_VALUE);
        System.out.println("Minimum  double value: " + Double.MIN_VALUE);
        System.out.println("Maximum  double value: " + Double.MAX_VALUE);
    }
}
```

Running this code in your integrated development environment (IDE) or code editor produces the following output:

```
Minimum  integer value: -2147483648
Maximum  integer value: 2147483647
Minimum  float value: 1.4E-45
Maximum  float value: 3.4028235E38
Minimum  double value: 4.9E-324
Maximum  double value: 1.7976931348623157E308
```
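To connect the sign, mantissa, and exponent parts with actual values, the sketch below takes a `float` apart with `Float.floatToIntBits` and rebuilds it. The class `FloatAnatomyDemo` and the method `rebuild` are hypothetical names. The code assumes a finite input and follows the Java Language Specification's convention in which M is a 24-bit integer for `float`, so the power of two applied to it is e - 23.

```java
// Hypothetical demo class; assumes finite inputs (not NaN or infinity).
class FloatAnatomyDemo {

    // Rebuilds a finite float as s * m * 2^(e - 23) from its raw bit pattern.
    static double rebuild(float value) {
        int bits = Float.floatToIntBits(value);
        int s = (bits >>> 31) == 0 ? 1 : -1;          // sign bit
        int biasedExponent = (bits >>> 23) & 0xFF;    // 8 exponent bits
        int fraction = bits & 0x7FFFFF;               // 23 stored mantissa bits

        // Normal numbers carry an implicit leading 1 bit; subnormal numbers
        // (biased exponent 0) do not, and their exponent is fixed at -126.
        int m = (biasedExponent == 0) ? fraction : (fraction | 0x800000);
        int e = (biasedExponent == 0) ? -126 : biasedExponent - 127;

        // Math.scalb multiplies by a power of two without rounding here.
        return Math.scalb((double) (s * m), e - 23);
    }

    public static void main(String[] args) {
        float[] samples = {0.1f, -6.789f, Float.MIN_VALUE, Float.MAX_VALUE};
        for (float sample : samples) {
            System.out.println(sample + " rebuilt exactly: "
                    + (rebuild(sample) == (double) sample));
        }
    }
}
```

Each sample is reconstructed exactly, including `Float.MIN_VALUE`, which corresponds to m = 1 and e = -126, and `Float.MAX_VALUE`, which corresponds to the largest 24-bit m and e = 127.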

## Final Thoughts on Java Floating-point Numbers

This was a quick programming tutorial covering the concepts behind floating-point numbers and how Java deals with them. Hopefully it clarifies the reason for the unpredictability associated with these types of numbers. Also, keep in mind that storing floating-point numbers is intricate, and so are the arithmetic operations performed on them.

Source: www.developer.com