I like this idea in general. When running under orchestrators like
YARN, Marathon, or Kubernetes, it's true that whatever starts Drill
"manages" memory. However, there are issues, in that you need to set up the
variables in Drill so they do not exceed the amount the orchestrator has
allocated. Once an orchestrator sees a managed process exceed what it
allocated, it often kills the process. In Drill that can mean drillbits
get killed during queries, which leads to a bad user
experience. Folks configuring Drill in the field have had to set the heap,
direct, and other settings and hope that they did it right to ensure this
does not happen.
This option provides a way for people to start working with reasonable
settings, and I like the choice of % or absolute values. This is important in
I think I saw in the JIRA that Drill will indicate at startup what
allocation was used, based on what variables. I think this is important.
Log at drillbit start, both to stdout and to the drillbit.log file. Indicate
what method was used for allocation, what the user-provided values were,
and, for auto allocations, the split that was chosen. Maybe even provide it in
such a way that if a user read it and wanted to tweak, they could take the
auto-allocated output message and cut and paste it into drill-env.sh,
i.e. print the variables and the values that got auto-allocated. That way,
as an administrator, if I felt the need to tweak settings, I can take
exactly what the auto-allocation output, put it into my env script, and
then tweak to my heart's desire.
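
To make that concrete, here is a rough sketch of the kind of startup message I mean. It is only an illustration in Python, not how Drill does or would do it: the function names, the 25% heap share, and the report layout are all made up for the example. The only names taken from Drill are the DRILL_HEAP and DRILL_MAX_DIRECT_MEMORY variables from drill-env.sh.

```python
import re

MB = 1024 ** 2
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_budget(spec: str, total_bytes: int) -> int:
    """Resolve a budget given as a percentage ('50%') or absolute size ('16G') into bytes.

    Hypothetical helper: the accepted syntax is an assumption for this sketch.
    """
    spec = spec.strip().upper()
    if spec.endswith("%"):
        pct = float(spec[:-1])
        if not 0 < pct <= 100:
            raise ValueError(f"percentage out of range: {spec}")
        return int(total_bytes * pct / 100)
    match = re.fullmatch(r"(\d+)([KMG])", spec)
    if match is None:
        raise ValueError(f"unrecognized budget: {spec}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def auto_allocate(budget_bytes: int, heap_fraction: float = 0.25) -> dict:
    """Split the budget into heap and direct memory.

    The 25% heap share is an arbitrary illustrative default, not Drill's rule.
    """
    heap = int(budget_bytes * heap_fraction)
    direct = budget_bytes - heap  # everything not given to heap goes to direct
    return {"DRILL_HEAP": heap, "DRILL_MAX_DIRECT_MEMORY": direct}


def startup_report(method: str, user_spec: str, allocation: dict) -> str:
    """Build a message whose export lines can be pasted into drill-env.sh."""
    lines = [
        f"# memory allocation method: {method}",
        f"# user-provided value: {user_spec}",
    ]
    for name, size in allocation.items():
        lines.append(f'export {name}="{size // MB}M"')
    return "\n".join(lines)


if __name__ == "__main__":
    total = 32 * 1024 ** 3  # pretend the container was given 32 GiB
    budget = parse_budget("50%", total)
    print(startup_report("auto", "50%", auto_allocate(budget)))
```

The comment lines record the method and the user's input, and the export lines are the part an administrator would copy into the env script and then adjust by hand.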
This is pretty cool.
On Fri, Feb 16, 2018 at 1:15 PM, Kunal Khatua <[EMAIL PROTECTED]> wrote: