Subject: Huge difference in speed between pyspark and scalaspark


This comparison comes up time and time again. Spark is written in
Scala and provides
APIs in Scala, Java, Python, and R.

However, its primary focus has been on Scala. In practical terms this means
that the Python and Java APIs are layered on top of the Scala/JVM core. If
you look under the bonnet, PySpark drives that core through a Py4J gateway,
and any work that has to run in Python itself (Python UDFs, for example)
involves shuttling data between the JVM and Python worker processes.
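
As a rough sketch of what that bridge looks like (assuming a local
SparkSession, and poking at PySpark internals purely for illustration):

from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("py4j-bridge").getOrCreate()

df = spark.range(10)

# The Python DataFrame is a thin wrapper around a JVM DataFrame;
# _jdf is an internal attribute, used here only to show the bridge.
print(type(df._jdf))                   # a py4j.java_gateway.JavaObject
print(df._jdf.schema().treeString())   # method call routed into the JVM

spark.stop()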

Hence that is a driver for Spark on Scala being the fastest. The bottom line
is that it is what it is: if you are going to use Python, expect that
behaviour to materialise, most visibly where your code forces rows across
the JVM/Python boundary.
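
To make that concrete, here is a minimal sketch (assuming a local session):
the built-in column expression runs entirely inside the JVM, whereas the
Python UDF ships every row out to a Python worker and back, and that round
trip is where the slowdown typically shows up.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, udf
from pyspark.sql.types import LongType

spark = SparkSession.builder.master("local[*]").getOrCreate()
df = spark.range(1000000)

# Stays in the JVM: planned and executed by the Scala engine.
jvm_side = df.select((col("id") * 2).alias("doubled"))

# Crosses the JVM/Python boundary for every row.
double_udf = udf(lambda x: x * 2, LongType())
python_side = df.select(double_udf(col("id")).alias("doubled"))

jvm_side.count()
python_side.count()

spark.stop()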

HTH,
Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com
Disclaimer: Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.
On Wed, 13 May 2020 at 14:30, Gerard Maas <[EMAIL PROTECTED]> wrote: