I have my own Python library that I would like to use in OpenRefine, as described here.
However, it seems that all Python code in OpenRefine goes through Jython, which supports only Python 2.
Is there a way to run Python 3 code in OpenRefine?
cheers
Short answer: NO. OpenRefine uses Jython, which is currently based on Python 2.7, and there are no immediate or short-term plans to move to 3.x.
BUT.
There is a trick to do this, as long as you have Python 3 installed on your machine. Python 2 lets you execute a command-line script or tool and collect its output.
This simple Python 2 script does that:
# This Jython 2.7 script has to be run as a Python / Jython expression, not GREL.
# It executes the cell value as a command (CLI) in a shell and retrieves the output.
# import basic libraries
import time
import commands
import random
# get the exit status and the output of the command stored in the cell
status, output = commands.getstatusoutput(value)
# pause for a random 2 to 5 seconds to avoid hammering servers... Be kind to APIs!
time.sleep(random.randint(2, 5))
# return the output of the command
return output.decode("utf-8")
I use it to execute local Python 3 scripts, but also dig, curl, etc.
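For the local-script case, the Python 3 side can stay very small, because the Jython expression only collects whatever the command prints. A minimal sketch, assuming a hypothetical file `my_tool.py` and a placeholder `transform` function standing in for a call into your own library (neither name comes from my actual setup):

```python
import sys


def transform(text):
    """Hypothetical placeholder for a call into your own Python 3 library."""
    return text.strip().upper()


def main(argv):
    """Print the transformed cell value; stdout is what the Jython side collects."""
    if len(argv) < 2:
        return 0  # no cell value was passed, so there is nothing to print
    print(transform(" ".join(argv[1:])))
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

The cell (or a helper column) would then hold a command such as `python3 /path/to/my_tool.py 'some text'`, with the path adjusted to wherever the script lives.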
Use case: suppose I have a bunch of internet domains in column A. I want to perform a dig SOA command on these domains.
This script is pure Python 2, doesn't rely on extra libraries, and should keep working indefinitely.
Disclaimer: execution of local code by a third-party app should be done cautiously.
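Because the cell value is handed to a shell as-is, it is worth building the commands carefully and trying them outside OpenRefine first. The sketch below is plain Python 3, not something to paste into OpenRefine: `subprocess.getstatusoutput` is the Python 3 counterpart of `commands.getstatusoutput`, and `shlex.quote` escapes the domain. The function names and the `+short` flag are my own choices for illustration.

```python
import shlex
import subprocess


def build_dig_command(domain):
    """Return a shell command string asking dig for a domain's SOA record."""
    # shlex.quote stops a malformed or malicious cell from injecting shell syntax.
    return "dig +short " + shlex.quote(domain.strip()) + " SOA"


def run(command):
    """Run a command through the shell and return its combined output."""
    status, output = subprocess.getstatusoutput(command)
    return output


# Commands you might store in a helper column, one per domain in column A.
commands_for_column = [build_dig_command(d) for d in ["example.com", "openrefine.org"]]
```

Calling `run(commands_for_column[0])` on a machine with dig installed should show what OpenRefine would receive for that row.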
I needed something like that (I had to "guess" the language in which the text of one column was written). What I found to be a nice solution, and one that worked quite fast (with some "extra features" easily added), was to wrap my Python 3 program as a Flask web API (it took, literally, 10 minutes) and use it from OpenRefine with "Add column by fetching URLs".
An added bonus is that it was rather easy to run it on the fastest machine we had on site, add a cache, etc.
The only thing that I would like to have seen improved (on OpenRefine's side) is the ability to, optionally, fetch several URLs in parallel; then you could run several Flask instances on several machines and speed things up a little.
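To give an idea of how little code such a wrapper needs, here is a rough sketch. It is not my actual service: it uses the standard library's `http.server` instead of Flask so that it runs without installing anything, and `guess_language`, the `/guess` path, the `text` parameter and port 5000 are all made-up placeholders.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse


def guess_language(text):
    """Toy placeholder (assumption): replace with the real Python 3 library call."""
    words = text.lower().split()
    return "fr" if "le" in words or "la" in words else "en"


def build_response(path):
    """Turn a request path such as /guess?text=le+chat into a JSON body."""
    query = parse_qs(urlparse(path).query)
    text = query.get("text", [""])[0]
    return json.dumps({"language": guess_language(text)}).encode("utf-8")


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = build_response(self.path)
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # stay quiet while OpenRefine fetches one URL per row


def make_server(port=5000):
    """Create the server; call .serve_forever() on the result to start it."""
    return ThreadingHTTPServer(("127.0.0.1", port), Handler)
```

After starting it with `make_server().serve_forever()`, the fetch expression in OpenRefine would be something along the lines of `"http://127.0.0.1:5000/guess?text=" + escape(value, "url")`, and the result can be pulled out of the new column with `parseJson`.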