Sunday, 28 September 2014

IO in Java (Using Scanner and BufferedReader)

Background

This is one of the most basic questions in Java: how do we take input and print output? There are multiple ways to do so. Most people use Scanner or BufferedReader for input and System.out.println or PrintWriter for output without considering the use case. Programmers who take part in coding competitions will understand what I am talking about. How you do IO operations determines your efficiency in such competitions, and you may end up getting what is known as TLE (Time Limit Exceeded). So in this post let's discuss what we should use for IO, when, and why.


Taking Input in Java

The two most common ways to take input are -

  1. Scanner :

    With Scanner you can do something like -

    // The second constructor argument is a charset name, not a delimiter,
    // so the delimiter is set with useDelimiter().
    Scanner scanner = new Scanner(System.in).useDelimiter(",");
    String stringInput  = scanner.nextLine();
    int intInput = scanner.nextInt();



    Few points to note here

    • Scanner is used for parsing tokens from the underlying stream. In the above case we have given the standard input stream (System.in) to the Scanner object.
    • Scanner can also tokenize your stream based on a delimiter. In the above case we have set comma (',') as the delimiter. The default delimiter is whitespace.
    • As of Java 6, Scanner has an initial buffer size of 1024 characters.
    • Also, Scanner is not synchronized, meaning it is not thread safe. So you should not share one Scanner across multiple threads without external synchronization.

  2. BufferedReader :

    In case of Buffered Reader you do something like -

    BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
    String line = null;
    while((line = reader.readLine()) != null) {
         System.out.println("Line read : " + line);
    }
    


    Few points to note here

    • BufferedReader simply reads characters from the stream; it does not parse them.
    • As of Java 6, BufferedReader has a default buffer size of 8192 characters.
    • BufferedReader is synchronized, so read operations on a BufferedReader can safely be done from multiple threads.
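
As a small runnable sketch of the Scanner option (item 1 above), the following reads comma-separated integers. The class name ScannerDelimiterDemo, the method readInts, and the delimiter pattern are assumptions made for this example, and an in-memory stream stands in for System.in so the sample is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerDelimiterDemo {

    // Hypothetical helper: reads comma-separated integers from the stream.
    static List<Integer> readInts(InputStream in) {
        List<Integer> values = new ArrayList<>();
        // The delimiter is set with useDelimiter(). This pattern (an assumption
        // for the example) treats a comma with optional surrounding whitespace,
        // or plain whitespace such as a trailing newline, as a separator.
        // The scanner is not closed here because that would also close the
        // caller's stream.
        Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\s*,\\s*|\\s+");
        while (scanner.hasNextInt()) {
            values.add(scanner.nextInt());
        }
        return values;
    }

    public static void main(String[] args) {
        // In a real program this would be System.in.
        InputStream sample =
                new ByteArrayInputStream("10, 20,30\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readInts(sample)); // prints [10, 20, 30]
    }
}
```

Checking hasNextInt() before each nextInt() avoids an InputMismatchException when the input ends or contains a token that is not an integer.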

Comparing Scanner and BufferedReader

  • Scanner parses tokens from the underlying stream, whereas BufferedReader simply reads the stream. In fact, you can pass a BufferedReader (it implements Readable) to Scanner.

    Scanner scanner = new Scanner(new BufferedReader(new InputStreamReader(System.in)));

  • As mentioned in the points above, BufferedReader has a larger buffer (8192 characters) than Scanner (1024 characters).

  • BufferedReader is faster than Scanner, as Scanner parses tokens after reading them. So if you notice, in coding competitions programmers generally use BufferedReader rather than Scanner. The only drawback is that you have to take care of input formats yourself.

    In fact, as per CodeChef's IO guidelines -

    Scanner is easily the most convenient way of reading in input, however it is very slow and not recommended unless the input is very small. ( less than 50KB of data is to be read )

    For example, if the input is an integer you may have to do -

    int data = Integer.parseInt(reader.readLine());
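
To illustrate what "taking care of input formats" can look like, here is a minimal sketch that reads whitespace-separated integers with BufferedReader and StringTokenizer. The names FastInputDemo and sumInts are made up for this example, and a StringReader stands in for System.in.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.StringTokenizer;

public class FastInputDemo {

    // Hypothetical helper: sums every whitespace-separated integer in the input.
    static long sumInts(BufferedReader reader) {
        long sum = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                // BufferedReader only hands back raw lines, so splitting and
                // number parsing are done by hand.
                StringTokenizer tokens = new StringTokenizer(line);
                while (tokens.hasMoreTokens()) {
                    sum += Integer.parseInt(tokens.nextToken());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        // In a contest this would typically be
        // new BufferedReader(new InputStreamReader(System.in)).
        BufferedReader reader = new BufferedReader(new StringReader("3\n1 2 3\n"));
        System.out.println(sumInts(reader)); // prints 9
    }
}
```

Integer.parseInt throws NumberFormatException on malformed tokens, so this approach assumes the input format is known in advance, which is usually the case in competitions.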

Why wrap with BufferedReader ?

Generally, each read request made of a Reader such as FileReader causes a corresponding read request to be made to the underlying stream. Each invocation of read() or readLine() could cause bytes to be read from the file, converted into characters, and then returned, which can be very inefficient.

FileReader fileReader = new FileReader(new File("data.txt"));
char[] data = new char[10];

fileReader.read(data); 

Efficiency is improved appreciably if a Reader is wrapped in a BufferedReader.

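FileReader fileReader = new FileReader(new File("data.txt"));
BufferedReader bufferedReader = new BufferedReader(fileReader);

String line = null;
// Read through the BufferedReader, not the raw FileReader.
while ((line = bufferedReader.readLine()) != null) {
    System.out.println("Line read : " + line);
}
bufferedReader.close(); // also closes the wrapped FileReader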
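
One way to see the effect described in this section is to count how many read requests actually reach the underlying reader. The sketch below is only an illustration: CountingReader and underlyingReads are hypothetical names invented here, and a StringReader stands in for a file so the example needs no data.txt.

```java
import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReadCallCountDemo {

    // Hypothetical helper: counts read requests that reach the wrapped reader.
    static class CountingReader extends FilterReader {
        int calls = 0;

        CountingReader(Reader in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            calls++;
            return super.read(buf, off, len);
        }
    }

    // Consumes the text one character at a time and returns how many
    // read requests were made to the underlying (counting) reader.
    static int underlyingReads(String text, boolean buffered) {
        CountingReader source = new CountingReader(new StringReader(text));
        Reader reader = buffered ? new BufferedReader(source) : source;
        try {
            while (reader.read() != -1) {
                // discard the character; only the call count matters here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }

    public static void main(String[] args) {
        String text = new String(new char[1000]).replace('\0', 'x');
        System.out.println("unbuffered reads: " + underlyingReads(text, false));
        System.out.println("buffered reads:   " + underlyingReads(text, true));
    }
}
```

Without the wrapper, every read() goes to the source; with BufferedReader, the source is asked for a large chunk at a time and the individual read() calls are served from memory.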


Printing output in java

This may not seem like a big deal, but it in fact is. Prefer PrintWriter over System.out.println, as PrintWriter can be faster for printing data to the console, especially when the output is flushed only once at the end.

PrintWriter writer = new PrintWriter(System.out,true);
writer.println("Hi there!");

The System.out variable is referencing an object of type PrintStream which wraps a BufferedOutputStream (at least in Oracle JDK 7). When you call one of the printX() or write() methods on PrintStream, it internally flushes the buffer of the underlying BufferedOutputStream.

That doesn't happen with PrintWriter. You have to do it yourself. 

Alternatively, you can create a PrintWriter with the autoFlush property set to true, which will flush on each println, printf, or format call, as I have done in the example above.
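
For large outputs, a common pattern is to leave autoFlush off and flush once at the end. The following is a rough sketch of that idea; the class BufferedOutputDemo and the method printNumbers are names chosen for this example only.

```java
import java.io.BufferedWriter;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;

public class BufferedOutputDemo {

    // Hypothetical helper: writes 1..n, one number per line, flushing once.
    static void printNumbers(OutputStream out, int n) {
        // PrintWriter(Writer) does not auto flush, so lines accumulate in
        // the BufferedWriter until flush() is called.
        PrintWriter writer = new PrintWriter(new BufferedWriter(new OutputStreamWriter(out)));
        for (int i = 1; i <= n; i++) {
            writer.println(i);
        }
        // A single flush pushes everything out. The writer is not closed
        // because that would also close the caller's stream (e.g. System.out).
        writer.flush();
    }

    public static void main(String[] args) {
        printNumbers(System.out, 5); // prints the numbers 1 to 5, one per line
    }
}
```

Forgetting the final flush() is the usual pitfall with this approach: the program may exit with part of the output still sitting in the buffer.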

Some other differences between System.out.println (PrintStream) and PrintWriter

  • PrintStream is a stream of bytes while PrintWriter is a stream of characters.
  • A PrintStream created as below uses the platform's default encoding, while with PrintWriter you can pass an OutputStreamWriter with a specific encoding.

    PrintStream stream = new PrintStream(output); 
    PrintWriter writer = new PrintWriter(new OutputStreamWriter(output, "UTF-8"));

  • As mentioned above, PrintStream methods internally auto flush the underlying BufferedOutputStream.
  • Note : autoFlush - A boolean; if true, the println, printf, or format methods will flush the output buffer. So if you do -

    PrintWriter writer = new PrintWriter(System.out,true);        
    writer.write("Hello World!");
    


    it will not cause auto flush. You will have to call flush explicitly.

    PrintWriter writer = new PrintWriter(System.out,true);
    writer.write("Hello World!");
    writer.flush();
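
The difference between write() and println() under autoFlush can be checked with a small experiment. This is a sketch under the assumption that an in-memory ByteArrayOutputStream is an acceptable stand-in for the console; AutoFlushDemo and sinkSizes are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;

public class AutoFlushDemo {

    // Hypothetical helper: returns how many bytes have reached the sink
    // after write() and then after println(), with autoFlush enabled.
    static int[] sinkSizes() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter writer = new PrintWriter(sink, true); // autoFlush = true

        writer.write("Hello World!");
        int afterWrite = sink.size(); // write() does not trigger auto flush

        writer.println();
        int afterPrintln = sink.size(); // println() flushes the buffered text

        return new int[] { afterWrite, afterPrintln };
    }

    public static void main(String[] args) {
        int[] sizes = sinkSizes();
        System.out.println("bytes after write():   " + sizes[0]);
        System.out.println("bytes after println(): " + sizes[1]);
    }
}
```

The first count stays at zero because the text is still held in the writer's internal buffer; only the println() call pushes it through.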
    

Pictorial representation of IO



 
