Objects of type Scanner are useful for breaking down formatted input into tokens and translating individual tokens according to their data type.

Breaking Input into Tokens
By default, a scanner uses white space to separate tokens. (White space characters include blanks, tabs, and line terminators. For the full list, refer to the documentation for Character.isWhitespace.) To see how scanning works, let's look at ScanXan, a program that reads the individual words in xanadu.txt and prints them out, one per line.

    import java.io.*;
    import java.util.Scanner;

    public class ScanXan {
        public static void main(String[] args) throws IOException {

            Scanner s = null;

            try {
                s = new Scanner(new BufferedReader(new FileReader("xanadu.txt")));

                // Each call to next() returns one white-space-delimited token.
                while (s.hasNext()) {
                    System.out.println(s.next());
                }
            } finally {
                if (s != null) {
                    s.close();
                }
            }
        }
    }

Notice that ScanXan invokes Scanner's close method when it is done with the scanner object. Even though a scanner is not a stream, you need to close it to indicate that you're done with its underlying stream.

The output of ScanXan looks like this:

    In
    Xanadu
    did
    Kubla
    Khan
    A
    stately
    pleasure-dome
    ...

To use a different token separator, invoke useDelimiter(), specifying a regular expression. For example, suppose you wanted the token separator to be a comma, optionally followed by white space. You would invoke:

    s.useDelimiter(",\\s*");

Translating Individual Tokens
The ScanXan example treats all input tokens as simple String values. Scanner also supports tokens for all of the Java language's primitive types (except for char), as well as BigInteger and BigDecimal. Also, numeric values can use thousands separators. Thus in a US locale, Scanner correctly reads the string "32,767" as representing an integer value.

We have to mention the locale, because thousands separators and decimal symbols are locale-specific. So the following example would not work correctly in all locales if we didn't specify that the scanner should use the US locale. That's not something you usually have to worry about, because your input data usually comes from sources that use the same locale as you do. But this example is part of the Java Tutorial, and gets distributed all over the world.

The ScanSum example reads a list of double values and adds them up. Here's the source:

    import java.io.FileReader;
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.util.Scanner;
    import java.util.Locale;

    public class ScanSum {
        public static void main(String[] args) throws IOException {

            Scanner s = null;
            double sum = 0;

            try {
                s = new Scanner(new BufferedReader(new FileReader("usnumbers.txt")));
                s.useLocale(Locale.US);

                while (s.hasNext()) {
                    if (s.hasNextDouble()) {
                        sum += s.nextDouble();
                    } else {
                        // Skip any token that is not a number.
                        s.next();
                    }
                }
            } finally {
                // Guard against a failed constructor, as in ScanXan.
                if (s != null) {
                    s.close();
                }
            }

            System.out.println(sum);
        }
    }

And here's the sample input file, usnumbers.txt:
    8.5
    32,767
    3.14159
    1,000,000.1

The output string is "1032778.74159". The decimal symbol here is always a period, because System.out.println(double) prints the result of Double.toString, which is not locale-sensitive. To produce output that follows a particular locale's conventions, such as a different decimal symbol or thousands separators, use formatting, as described in the next topic, Formatting.
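As a rough sketch of the formatting route mentioned above, the sum can be rendered with String.format and an explicit Locale, so the decimal symbol does not depend on the default locale. The class name ScanSumFormatted and the helper sumAndFormat are hypothetical names chosen for illustration. The sketch also assumes the input is an in-memory string rather than usnumbers.txt, and that five decimal places are wanted, to match the sample data.

```java
import java.util.Locale;
import java.util.Scanner;

public class ScanSumFormatted {

    // Hypothetical helper: sums every double token in the input, reading
    // numbers with US conventions, and formats the total with US conventions.
    static String sumAndFormat(String input) {
        double sum = 0;
        // Scanner is Closeable, so try-with-resources closes it for us.
        try (Scanner s = new Scanner(input)) {
            s.useLocale(Locale.US);
            while (s.hasNext()) {
                if (s.hasNextDouble()) {
                    sum += s.nextDouble();
                } else {
                    s.next(); // skip tokens that are not numbers
                }
            }
        }
        // Assumption: five decimal places, matching the sample data.
        return String.format(Locale.US, "%.5f", sum);
    }

    public static void main(String[] args) {
        // Same values as the sample input file, held in a string.
        String input = "8.5 32,767 3.14159 1,000,000.1";
        System.out.println(sumAndFormat(input)); // prints 1032778.74159
    }
}
```

Passing a different Locale to String.format, for example Locale.FRANCE, would change only how the result is written out. The scanner would still read the input using US conventions.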