How to Debug PySpark Code in a Jupyter Notebook
Question
I am wondering whether I can debug PySpark code in a Jupyter notebook. I have tried the solution that works for regular Python code in Jupyter, using the ipdb module, but it does not work in a notebook running the PySpark kernel.
Please note: my question is about debugging PySpark within a Jupyter notebook, not in IntelliJ IDEA or any other Python IDE.
Background:
- I am on macOS Yosemite.
- My Spark version is 1.6.2.
- The Jupyter kernel is Apache Toree PySpark.
- I have ipdb installed.
Any help would be appreciated.
Answer
If you want to play around with and debug PySpark code in a Jupyter notebook, then once Spark is installed and set up (a good guide showing how is here: https://blog.sicara.com/get-started-pyspark-jupyter-guide-tutorial-ae2fe84f594f), you can import SparkSession and create a local instance:
from pyspark.sql import SparkSession

# "local[1]" runs Spark in-process with a single worker thread,
# which keeps output and stack traces easy to follow in a notebook
spark = SparkSession.builder.master("local[1]").appName("pyspark-test").getOrCreate()
df = spark.read.csv("test.csv", header=True)