Pyspark in Flask


Question

I was trying the solution from this post to access PySpark from a Flask app: Access to Spark from Flask app

But when I tried this in my cmd:

 ./bin/spark-submit yourfilename.py

I get:

 '.' is not recognized as an internal or external command,
operable program or batch file.

Is there any solution to this?

I tried placing the .py file inside the bin folder and running spark-submit app.py. Here is the result:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/03/21 01:52:00 INFO SparkContext: Running Spark version 2.2.0
18/03/21 01:52:01 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/03/21 01:52:01 INFO SparkContext: Submitted application: app.py
18/03/21 01:52:01 INFO SecurityManager: Changing view acls to: USER
18/03/21 01:52:01 INFO SecurityManager: Changing modify acls to: USER
18/03/21 01:52:01 INFO SecurityManager: Changing view acls groups to:
18/03/21 01:52:01 INFO SecurityManager: Changing modify acls groups to:
18/03/21 01:52:01 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(USER); groups with view permissions: Set(); users  with modify permissions: Set(USER); groups with modify permissions: Set()
18/03/21 01:52:02 INFO Utils: Successfully started service 'sparkDriver' on port 62901.
18/03/21 01:52:02 INFO SparkEnv: Registering MapOutputTracker
18/03/21 01:52:02 INFO SparkEnv: Registering BlockManagerMaster
18/03/21 01:52:02 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/03/21 01:52:02 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/03/21 01:52:02 INFO DiskBlockManager: Created local directory at C:\Users\USER\AppData\Local\Temp\blockmgr-5504ca97-3578-4f22-9c0e-b5230bc02369
18/03/21 01:52:02 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
18/03/21 01:52:02 INFO SparkEnv: Registering OutputCommitCoordinator
18/03/21 01:52:03 INFO Utils: Successfully started service 'SparkUI' on port 4040.
18/03/21 01:52:03 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.56.1:4040
18/03/21 01:52:03 INFO SparkContext: Added file file:/D:/opt/spark/spark-2.2.0-bin-hadoop2.7/bin/app.py at file:/D:/opt/spark/spark-2.2.0-bin-hadoop2.7/bin/app.py with timestamp 1521568323605
18/03/21 01:52:03 INFO Utils: Copying D:\opt\spark\spark-2.2.0-bin-hadoop2.7\bin\app.py to C:\Users\USER\AppData\Local\Temp\spark-de856657-5946-4d4f-a7ea-9c2740c88add\userFiles-8c88d3b0-5a05-4c54-861d-ce4397ed0bd5\app.py
18/03/21 01:52:04 INFO Executor: Starting executor ID driver on host localhost
18/03/21 01:52:04 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 62910.
18/03/21 01:52:04 INFO NettyBlockTransferService: Server created on 192.168.56.1:62910
18/03/21 01:52:04 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
18/03/21 01:52:04 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.56.1, 62910, None)
18/03/21 01:52:04 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.56.1:62910 with 366.3 MB RAM, BlockManagerId(driver, 192.168.56.1, 62910, None)
18/03/21 01:52:04 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.56.1, 62910, None)
18/03/21 01:52:04 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.56.1, 62910, None)
18/03/21 01:52:06 INFO SparkContext: Invoking stop() from shutdown hook
18/03/21 01:52:06 INFO SparkUI: Stopped Spark web UI at http://192.168.56.1:4040
18/03/21 01:52:06 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/03/21 01:52:06 INFO MemoryStore: MemoryStore cleared
18/03/21 01:52:06 INFO BlockManager: BlockManager stopped
18/03/21 01:52:06 INFO BlockManagerMaster: BlockManagerMaster stopped
18/03/21 01:52:06 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/03/21 01:52:06 INFO SparkContext: Successfully stopped SparkContext
18/03/21 01:52:06 INFO ShutdownHookManager: Shutdown hook called
18/03/21 01:52:06 INFO ShutdownHookManager: Deleting directory C:\Users\USER\AppData\Local\Temp\spark-de856657-5946-4d4f-a7ea-9c2740c88add\pyspark-5cc3aa4c-3890-4600-b4c0-090e179c18eb
18/03/21 01:52:06 INFO ShutdownHookManager: Deleting directory C:\Users\USER\AppData\Local\Temp\spark-de856657-5946-4d4f-a7ea-9c2740c88add

I can't get the Flask app to run anymore.

Answer

 '.' is not recognized as an internal or external command,
 operable program or batch file.

The above error occurs because you are executing on Windows. Try an absolute path, or set the SPARK_HOME environment variable and invoke bin\spark-submit; the error will go away.
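As a concrete sketch of that fix in the Windows cmd shell, using the install path visible in the log above (adjust it to wherever your Spark distribution actually lives):

```bat
:: Windows cmd: ./bin/spark-submit fails because cmd does not resolve ./ paths.
:: Set SPARK_HOME once, then call spark-submit through it.
set SPARK_HOME=D:\opt\spark\spark-2.2.0-bin-hadoop2.7
%SPARK_HOME%\bin\spark-submit yourfilename.py
```

Alternatively, add %SPARK_HOME%\bin to your PATH so plain `spark-submit yourfilename.py` works from any directory.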

For the Flask app, run it as below and it should work:

set FLASK_APP=yourfilename.py
flask run

You can then access your app at:

http://127.0.0.1:5000/path
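Note also that submitting the Flask file through spark-submit (as attempted above) starts a SparkContext and then exits, because nothing blocks. When running via `flask run` instead, a common pattern is to create the SparkSession lazily on first request. A minimal sketch, assuming Flask and PySpark are installed; the route names and app name here are hypothetical:

```python
# Minimal Flask + PySpark sketch: Spark is started lazily on first use,
# so `flask run` starts quickly and non-Spark routes stay responsive.
from flask import Flask, jsonify

app = Flask(__name__)
_spark = None


def get_spark():
    """Create the SparkSession once, on first use, instead of at import time."""
    global _spark
    if _spark is None:
        from pyspark.sql import SparkSession
        _spark = (SparkSession.builder
                  .master("local[*]")
                  .appName("flask-pyspark")
                  .getOrCreate())
    return _spark


@app.route("/")
def index():
    # Plain route; does not touch Spark at all.
    return "ok"


@app.route("/count")
def count():
    # First hit to this route pays the Spark startup cost; later hits reuse it.
    spark = get_spark()
    df = spark.range(100)  # tiny demo DataFrame
    return jsonify(count=df.count())
```

Save this as yourfilename.py, then `set FLASK_APP=yourfilename.py` and `flask run` as in the answer above.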

