Unfortunately, I am unable to see how option 2 could be done from the Beam
side. Or is it meant to defer the decision to the end user?

Let me try to elaborate:

Assume a user implements a pipeline with 2 different IOs, both depending on
incompatible versions of e.g. guava. The user code also has a (direct or
transitive) dependency on yet another incompatible version of guava. I'll
try to go into more detail later, but for now assume:

     AIO   ----     ParDo    ----    ParDo   ----     ParDo   ----    BIO
       |                                          |
    guava-X.jar                       guava-Y.jar

What currently happens (AFAIU) is that during the build we pin guava to
some fixed version [1], which means AIO and BIO are compiled against
guava-20, which might or might not be ABI compatible with guava-X and/or
guava-Z. (This is the problem these 2 users encountered, right?)

Now the user trying to run this pipeline has to decide which version to
use. Whatever she chooses, there might be incompatibilities. She could do so
with Maven, and of course also with Gradle or any other build system.
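
To see which guava actually won on the runtime classpath, something like the
following might help. It is only a minimal sketch using plain JDK reflection;
the class name WhichJar is made up for illustration and is not part of Beam.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Beam): reports where a class was loaded from.
final class WhichJar {

  /** Returns the jar or directory a class was loaded from, or a short explanation. */
  static String locate(String className) {
    try {
      // initialize=false: we only want to locate the class, not run its static init.
      Class<?> clazz = Class.forName(className, false, WhichJar.class.getClassLoader());
      CodeSource source = clazz.getProtectionDomain().getCodeSource();
      if (source == null || source.getLocation() == null) {
        return "<no code source, e.g. a JDK bootstrap class>";
      }
      return source.getLocation().toString();
    } catch (ClassNotFoundException | LinkageError e) {
      return "<not loadable: " + e + ">";
    }
  }

  public static void main(String[] args) {
    // Prints the jar that provides guava, if any guava is on the classpath.
    System.out.println("guava: " + locate("com.google.common.collect.ImmutableList"));
  }
}
```

Running this with the pipeline's runtime classpath should show which single
guava jar the JVM picked, regardless of what AIO, BIO or the user code were
compiled against.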

If we now replace guava-20 with the newest version at build time, let's say
guava-LATEST, which happens to be a dependency of CIO, how would that change
the game? Is there anything Beam could do about it?

Now I'll try to understand those deps in more detail. We have the IOs and
the user code. I assume Beam's own code has switched to vendored guava, so
nothing is exposed there; we only have issues on external interfaces such
as IOs.

1. The user code.
Whether guava is used directly or transitively through some library
dependency, there is nothing Beam can do about it. The user has to solve
this issue herself. She will probably encounter exactly the same
problems as Beam does within the IOs.

2. IOs

a: Beam IO code uses guava, no transitive deps. This is solved by
vendoring, right?

   SomeIO <--- guava-20.jar           is replaced by
   SomeIO <--- vendored_guava_v20.jar
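
To spell out why vendoring takes case a off the table: relocation rewrites
the package names, so the vendored classes can no longer clash with whatever
guava the user brings. Here is a toy sketch of that renaming. The vendored
prefix below is my assumption of what the relocated package looks like, and
RelocationSketch is a made-up name, not Beam's build logic.

```java
// Toy model of what a relocating shader does to class names. Not Beam's build logic.
final class RelocationSketch {
  // Both prefixes are assumptions for illustration.
  static final String ORIGINAL_PREFIX = "com.google.common.";
  static final String VENDORED_PREFIX =
      "org.apache.beam.vendor.guava.v20_0.com.google.common.";

  /** Maps a guava class name to its relocated name; other names pass through unchanged. */
  static String relocate(String className) {
    return className.startsWith(ORIGINAL_PREFIX)
        ? VENDORED_PREFIX + className.substring(ORIGINAL_PREFIX.length())
        : className;
  }

  public static void main(String[] args) {
    System.out.println(relocate("com.google.common.base.Preconditions"));
    System.out.println(relocate("java.util.List"));
  }
}
```

Since the relocated name differs from com.google.common.*, both the vendored
copy and the user's guava-X can sit on the same classpath without shadowing
each other.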

b: Beam uses some client lib (e.g. some-client-core) which has a dependency
on guava but does not expose any guava class to the outside.

   SomeIO <---- some-client-lib <---- guava-X.jar

 In this case Beam 'could' relocate the client lib and thereby 'solve' the
problem of guava being exposed on the classpath, *but* we might then be tied
to a specific version of the backend, as the user can no longer replace the
'some-client-core' dependency with a different version (hoping there are no
interop issues with Beam's IO).
I guess we are better off not doing that? Which unfortunately means
deferring version resolution to the end user, as the transitive dependency
will end up on the end user's runtime classpath and might cause a diamond
dependency conflict.

c: Beam uses some client lib (e.g. bigquery-client-core) which has a
dependency on guava and does expose guava classes in its external API.

   SomeIO <---- some-client-lib <---- guava-X.jar
      |                                                      |

We could of course also try to shade here, with the above drawbacks, but
this might not even work, as the client lib communicates with the backend
and might also expose guava there. So the best we could do is, as Kenn
stated, to ensure interoperability of SomeIO with the client.

So, to my current understanding, I second Kenn here. We should stop forcing
a specific version of (of course unvendored) guava and align, on a per-IO
basis, with the requirements of the dependencies. Still, on that lower (IO)
level we might need to force some version, as even there we might have
diamond deps on guava. But I think we are better off not forcing a
Beam-global version here.
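
To make the comparison a bit more concrete, here is a toy model of a global
pin versus a "highest requested version wins" resolution (which, as far as I
know, is Gradle's default conflict strategy). Everything in it is an
assumption for illustration: the major versions are placeholders, and a
version mismatch is only a rough stand-in for a real ABI incompatibility.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: real resolution works on full version strings and real ABI checks.
final class GuavaPinSketch {

  /** Names of modules whose required major version differs from the resolved one. */
  static List<String> mismatched(Map<String, Integer> required, int resolved) {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, Integer> entry : required.entrySet()) {
      if (entry.getValue().intValue() != resolved) {
        result.add(entry.getKey());
      }
    }
    return result;
  }

  /** "Highest requested version wins". */
  static int highest(Map<String, Integer> required) {
    return Collections.max(required.values());
  }

  /** Hypothetical requirements; the version numbers are placeholders. */
  static Map<String, Integer> example() {
    Map<String, Integer> required = new LinkedHashMap<>();
    required.put("AIO", 25);
    required.put("BIO", 27);
    required.put("user code", 23);
    return required;
  }

  public static void main(String[] args) {
    Map<String, Integer> required = example();
    System.out.println("forced to 20: " + mismatched(required, 20));
    System.out.println("highest wins: " + mismatched(required, highest(required)));
  }
}
```

With these made-up numbers the global pin to 20 matches nobody, while taking
the highest requested version at least matches one of the parties. The user
still has to check the rest herself.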

Of course, that will force the user to handle the problem, but she needs to
do it anyway, and unfortunately I also fail to see anything we could do
better here. (At least the current forcing to v20 seems to worsen the
problem.)


On Tue, Mar 5, 2019 at 11:23 PM Gleb Kanterov <[EMAIL PROTECTED]> wrote: