Reading from Files
There are two APIs for reading from files:
hdfsRead()
and
hdfsPread()
. With both
hdfsRead()
and
hdfsPRead()
, you pass a pointer to a
buffer for the runtime to read bytes into and the length of the
buffer. There maximum length of the buffer is the maximum size of
the datatype that is used to specify the buffer length. The
datatype is a custom datatype: tSize
, a signed 32-bit
integer.
Both functions return the number of bytes that are actually read.
For an example of both APIs in action, see hdfs_read_revised.c
.
Using hdfsRead()
Whenever you open a file, the file pointer is
placed at offset 0. If you want to start reading at an offset other
than 0, call hdfsSeek()
to move the file
pointer forward to that offset before you call
hdfsRead()
.
When you call
hdfsSeek()
, you specify the offset as a
value of type tOffset
, which is a fixed-width, signed
64-byte integer type for storing offsets. tOffset
is
defined in hdfs.h
.
If a file is already open and you are not sure
what the current offset is, you can find out by calling
hdfsTell()
.
After hdfsRead()
finishes a read operation, the current offset is set to the last
byte read plus one.
Using hdfsPread()
With hdfsPread()
, you
specify the offset at which you want to start reading, so you don’t
first have to call hdfsSeek()
to move to
that offset.
However, the offset that you specify does not
change the current offset in the file. After
hdfsPread()
finishes the read operation,
the current offset is not set to the last byte read plus one.
Instead, the current offset remains as it was before the read
operation.