I've pip installed boto3 on my local machine, and then I ran spark-submit in local mode while passing the path to the directory boto3 is installed in, leaving me with the following command:
spark-submit --conf spark.driver.extraClassPath=/Library/Python/2.7/site-packages app.py
Then, when I import boto3 in my app.py, it throws the dreaded ModuleNotFoundError.
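For context, here is a minimal sketch of the kind of script I'm running (the SparkSession setup and the S3 call are just illustrative placeholders; the failure happens on the import line itself):

    # app.py -- minimal sketch; only the `import boto3` line is the real culprit,
    # the rest is placeholder code to show where boto3 would be used on the driver.
    import boto3  # raises ModuleNotFoundError under spark-submit

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("boto3-test").getOrCreate()

    # hypothetical boto3 call on the driver
    s3 = boto3.client("s3")
    print(s3.list_buckets()["Buckets"])

    spark.stop()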
Is this the correct way to add a pip-installed Python dependency to a spark-submit job?