Hi Animesh,

First, I would suggest switching over to the Arrow 0.8 release as soon as
possible, since you are writing Java programs and the Java API usage has
changed drastically. The new APIs are much simpler, with good javadocs and
detailed internal comments.

If you are writing a stop-gap implementation, it is probably fine to
continue with the old version, but for the long term the new API is
recommended. Either way, the basic steps for working with a vector are:

   - Create an instance of the vector. Note that this doesn't allocate any
   memory for the elements in the vector.
   - Grab the corresponding mutator and accessor objects by calling
   getMutator() and getAccessor().
   - Allocate memory
      - *allocateNew()* - allocates memory for a default number of
      elements in the vector. This is applicable to both fixed width and
      variable width vectors.
      - *allocateNew(valueCount)* - for fixed width vectors. Use this
      method if you already know the number of elements to store in the
      vector.
      - *allocateNew(bytes, valueCount)* - for variable width vectors. Use
      this method if you already know the total size (in bytes) of all the
      variable width elements you will be storing in the vector. For
      example, if you are going to store 1024 elements in the vector and
      the total size across all variable width elements is under 1 MB, you
      can call allocateNew(1024*1024, 1024).
   - Populate the vector:
      - Use the *set()* or *setSafe()* APIs in the mutator interface. From
      Arrow 0.8 onwards, the mutator and accessor are removed and you can
      use these APIs directly on the vector instance.
      - Each set() API has a corresponding setSafe() API. The difference
      is that the latter internally takes care of expanding the vector's
      buffer(s) for storing new data, whereas set() assumes the memory has
      already been allocated.
   - Call setValueCount() with the number of elements you populated in
   the vector.
   - Retrieve elements from the vector:
      - Use the get() and getObject() APIs in the accessor interface.
      Again, from Arrow 0.8 onwards you can use these APIs directly on the
      vector instance.
   - With respect to usage of setInitialCapacity:
      - Let's say your application always issues calls to allocateNew(). It
      is likely that this will end up over-allocating memory because it
      assumes a default value count to begin with.
      - In this case, if you do setInitialCapacity() followed by
      allocateNew(), then the latter doesn't do the default memory
      allocation. It allocates exactly for the value capacity you
      specified in setInitialCapacity().
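
To make the set() vs. setSafe() and setInitialCapacity() points above a bit
more concrete, here is a small toy model in plain Java. Please note this is
NOT Arrow code: ToyIntVector is a made-up class that only mirrors the method
names discussed above, its default capacity of 4096 is just an illustrative
value, and the doubling growth policy is an assumption of the sketch. It is
only meant to show the call order and what happens when you write past the
allocated capacity.

```java
// Toy model only: this is NOT Arrow's implementation. The class is
// hypothetical and only mirrors the method names discussed in this mail.
final class ToyIntVector {
    // Illustrative default value count; an assumption of this sketch.
    static final int DEFAULT_CAPACITY = 4096;

    private int initialCapacity = DEFAULT_CAPACITY;
    private int[] buffer = new int[0]; // creating the vector allocates nothing
    private int valueCount = 0;

    // Overrides the value count used by the no-argument allocateNew().
    void setInitialCapacity(int capacity) {
        initialCapacity = capacity;
    }

    // Allocates for the default value count, or for the value given to
    // setInitialCapacity() if that was called first.
    void allocateNew() {
        allocateNew(initialCapacity);
    }

    // Allocates for a known number of elements.
    void allocateNew(int capacity) {
        buffer = new int[capacity];
        valueCount = 0;
    }

    int getValueCapacity() {
        return buffer.length;
    }

    // set(): assumes memory is already there; fails past the capacity.
    void set(int index, int value) {
        buffer[index] = value;
    }

    // setSafe(): grows the buffer (by doubling here) until index fits.
    void setSafe(int index, int value) {
        if (index >= buffer.length) {
            int newCapacity = Math.max(1, buffer.length);
            while (newCapacity <= index) {
                newCapacity *= 2;
            }
            buffer = java.util.Arrays.copyOf(buffer, newCapacity);
        }
        buffer[index] = value;
    }

    // Records how many elements were actually populated.
    void setValueCount(int count) {
        valueCount = count;
    }

    int getValueCount() {
        return valueCount;
    }

    // Reads are only valid below the value count set by the writer.
    int get(int index) {
        if (index >= valueCount) {
            throw new IndexOutOfBoundsException(
                "index " + index + " >= valueCount " + valueCount);
        }
        return buffer[index];
    }
}

final class ToyVectorDemo {
    public static void main(String[] args) {
        ToyIntVector vector = new ToyIntVector();
        vector.setInitialCapacity(8); // avoid the default-sized allocation
        vector.allocateNew();
        for (int i = 0; i < 8; i++) {
            vector.set(i, i * 10);    // fine: within the allocated capacity
        }
        vector.setSafe(8, 80);        // past capacity: the buffer is expanded
        vector.setValueCount(9);
        System.out.println(vector.get(8) + " " + vector.getValueCapacity());
    }
}
```

In the toy, calling set(8, 80) instead of setSafe(8, 80) in the demo would
throw, which is the kind of mistake the setSafe() APIs are there to prevent.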

I would highly recommend taking a look at
This has lots of examples of populating the vector, retrieving from the
vector, using setInitialCapacity(), and using set(), setSafe() and
combinations of them, which help in understanding when things can go wrong.

Hopefully this helps. Meanwhile, we will try to add an internal README on
the usage of vectors.


On Tue, Dec 19, 2017 at 8:55 AM, Emilio Lahr-Vivaz <[EMAIL PROTECTED]>